JSON & Advanced Data Types
Store flexible data with JSON and understand the nuances of MySQL data types.
Stage 1 introduced basic data types. This stage goes deeper: JSON for flexible schemaless data, CHAR vs VARCHAR vs TEXT for strings, ENUM for fixed sets, and TIMESTAMP vs DATETIME for time tracking. These choices affect storage, performance, and correctness.
Definition
MySQL JSON data type stores and queries semi-structured JSON data. MySQL provides functions like JSON_EXTRACT(), JSON_TABLE(), and -> / ->> operators to query JSON columns.
Required step: Make sure you understand CREATE TABLE and basic data types from Stage 1.
JSON Data Type
MySQL 8.0+ stores JSON natively with indexing support. JSON columns hold valid JSON documents and let you query specific keys without parsing the entire document.
Create a table with JSON
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
attributes JSON
);
INSERT INTO products (name, attributes) VALUES
('Laptop', '{"color": "silver", "ram": 16, "tags": ["portable", "powerful"]}'),
('Phone', '{"color": "black", "ram": 8, "tags": ["compact"]}');The attributes column stores JSON documents. Each row can have different keys. No fixed schema required for the JSON column.
Extract values from JSON
SELECT name,
attributes->>'$.color' AS color,
attributes->>'$.ram' AS ram
FROM products;The ->> operator extracts a value as text. $.color accesses the 'color' key. The -> operator returns JSON-typed values, ->> returns plain text.
Filter by JSON values
SELECT name FROM products
WHERE CAST(attributes->>'$.ram' AS UNSIGNED) > 10;The ->> operator returns text, so we CAST it to a number for correct numeric comparison. Without CAST, MySQL compares strings ('9' > '10' would be true). A generated column with an index gives better performance for frequent filters.
JSON functions
SELECT
JSON_ARRAY(1, 2, 3) AS my_array,
JSON_OBJECT('key1', 'value1', 'key2', 'value2') AS my_object,
JSON_LENGTH('{"a": 1, "b": 2, "c": 3}') AS key_count;JSON_ARRAY creates an array. JSON_OBJECT creates an object. JSON_LENGTH returns the number of elements in an array or keys in an object.
Performance tip:Generated Columns for JSON Indexing
You cannot directly index a JSON column. Instead, create a generated column that extracts the value, then index that column.
ALTER TABLE products
ADD COLUMN ram INT GENERATED ALWAYS AS (attributes->>'$.ram') STORED,
ADD INDEX idx_ram (ram);Now queries filtering by ram use the index instead of scanning every row.
CHAR vs VARCHAR vs TEXT
All three store strings, but they differ in storage, performance, and limits.
Reference:String Type Comparison
CHAR(n): Fixed-width. Always stores n characters, padding with spaces if shorter. Faster for fixed-length data (country codes, MD5 hashes). Max 255 characters.
VARCHAR(n): Variable-width. Stores only the actual content plus 1-2 bytes for length. Best for most text fields. Max 65,535 characters (total row size limit).
TEXT: Variable-width, up to 65,535 bytes. Can have a DEFAULT value if specified as an expression (e.g., DEFAULT ('hello')) since MySQL 8.0.13. Stored separately from the row (slower to read). Use for long content like descriptions, comments, articles.
ENUM
ENUM stores one value from a fixed list of allowed values. It stores the numeric index internally (1, 2, 3, ...) which makes it compact and fast.
ENUM column
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
status ENUM('pending', 'shipped', 'delivered', 'cancelled') NOT NULL
);
INSERT INTO orders (status) VALUES ('pending'), ('shipped');
-- INSERT INTO orders (status) VALUES ('unknown'); -- Error: invalid ENUM valueThe status column can only hold one of the four listed values. Inserting an unlisted value causes an error.
ENUM internal index
SELECT status, FIELD(status, 'pending', 'shipped', 'delivered', 'cancelled') AS numeric_index
FROM orders;ENUM values are stored as numeric indexes internally (pending=1, shipped=2, delivered=3, cancelled=4). FIELD() shows this index. This makes ENUM comparisons fast.
Add a new ENUM value
ALTER TABLE orders
MODIFY COLUMN status ENUM('pending', 'shipped', 'delivered', 'cancelled', 'returned')
NOT NULL;Adds 'returned' to the allowed values. You must include all existing values plus the new one in the MODIFY statement.
TIMESTAMP vs DATETIME
Both store date and time, but they differ in range, storage, and timezone behavior.
Key differences:TIMESTAMP vs DATETIME
DATETIME: Range 1000-01-01 to 9999-12-31. Stores as-is (no timezone conversion). 8 bytes. Supports DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP (must be specified explicitly; not implicit by default).
TIMESTAMP: Range 1970-01-01 to 2038-01-19. Converts to UTC on storage and back to the session timezone on retrieval. 4 bytes. Supports DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP.
Recommendation: Use DATETIME for dates that will not change (birth date, event date). Use TIMESTAMP for rows that track when they were created or last modified.
TIMESTAMP with auto-update
CREATE TABLE audit_log (
id INT AUTO_INCREMENT PRIMARY KEY,
action VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);created_at is set when the row is inserted. updated_at is set on insert AND automatically updated to the current time whenever the row is modified.
What is the main difference between CHAR and VARCHAR?
What is the difference between -> and ->> in JSON?
Why is TIMESTAMP limited to 2038?
Key Takeaways
- JSON columns store native JSON with indexing via generated columns.
- ->> extracts JSON values as text. -> returns JSON-typed values.
- CHAR is fixed-width (fast for fixed data). VARCHAR is variable-width (best for most text). TEXT is for long content.
- ENUM stores one value from a fixed list. It uses numeric indexes internally for fast comparisons.
- TIMESTAMP converts to UTC (auto-update capable, limited to 2038). DATETIME stores as-is (no timezone, auto-update available via ON UPDATE CURRENT_TIMESTAMP, range to 9999).
Ready to test your knowledge?
Take a Quiz