Skip to content

Isolation Levels & Character Sets

Control concurrent transaction behavior and store international text correctly with utf8mb4.

Stage 14 introduced transactions: START TRANSACTION, COMMIT, ROLLBACK. But what happens when two users modify the same rows at the same time? Isolation levels control how transactions see each other's changes. Separately, MySQL's "utf8" character set is actually a 3-byte encoding that cannot store emoji or many international characters. Understanding these two topics prevents real-world bugs that are hard to diagnose.

Definition

Transaction isolation levels control how concurrent transactions see each other's changes. The four standard levels are READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ (MySQL default), and SERIALIZABLE.

Required step: Make sure you understand transactions from Stage 14 and data types from Stage 1.

Why Isolation Levels Matter

When two transactions run concurrently and modify the same data, problems can occur: reading uncommitted changes, inconsistent reads within a transaction, or phantom rows appearing. Isolation levels define which of these problems are prevented.

Key concepts:Concurrency Problems

Dirty read: Transaction A reads a row that Transaction B modified but has not yet committed. If B rolls back, A read data that never existed.

Non-repeatable read: Transaction A reads the same row twice and gets different values because Transaction B modified and committed it between the two reads.

Phantom read: Transaction A runs the same query twice and gets a different number of rows because Transaction B inserted or deleted rows between the two executions.

Isolation Levels

MySQL supports four isolation levels, from least to most restrictive. Each prevents some concurrency problems but at the cost of performance.

Reference:Isolation Level Comparison

READ UNCOMMITTED: Allows dirty reads, non-repeatable reads, and phantom reads. Fastest but least safe. Rarely used.

READ COMMITTED: Prevents dirty reads. Allows non-repeatable reads and phantom reads. Used by PostgreSQL and Oracle as default.

REPEATABLE READ (MySQL default): Prevents dirty reads and non-repeatable reads. Allows phantom reads in standard SQL. InnoDB adds gap locks to prevent phantoms, so this level is safer in MySQL than the SQL standard implies.

SERIALIZABLE: Prevents all three problems. Transactions execute as if serialized (one after another). Slowest due to heavy locking.

Set isolation level

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

Sets the isolation level for the next transaction. This statement must be outside a transaction (before START TRANSACTION).

View current isolation level

SELECT @@transaction_isolation;

Returns the current session's isolation level. Default is REPEATABLE-READ.

Set for entire session

SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

Changes the isolation level for all subsequent transactions in this session. The setting persists until the session ends.

Guidelines:When to Use Each Level

REPEATABLE READ (default): Good for most applications. Provides consistent reads within a transaction. InnoDB's gap locks make it safer than the standard implies.

READ COMMITTED: Use when you need to see committed changes from other transactions immediately (e.g., dashboards, reporting). Prevents stale reads but allows non-repeatable reads.

SERIALIZABLE: Use only when data consistency is critical and you can accept the performance penalty (e.g., financial transfers, inventory management).

Character Sets and Collations

A character set defines how characters are stored as bytes. A collation defines how characters are compared and sorted. MySQL's "utf8" is actually utf8mb3 (3-byte) and cannot store emoji or characters outside the Basic Multilingual Plane. Always use utf8mb4.

Critical distinction:utf8 vs utf8mb4

utf8 (utf8mb3): 3-byte encoding. Stores most characters but cannot store emoji, many CJK characters, or any character requiring 4 bytes. Max 3 bytes per character.

utf8mb4: 4-byte encoding. Full Unicode support including emoji, all CJK characters, and every Unicode code point. This is what you actually want.

MySQL's "utf8" is a historical misnomer. It was created before the Unicode standard expanded to 4 bytes. The name was never corrected for backward compatibility. Always use utf8mb4.

Set character set at database level

CREATE DATABASE myapp
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

Creates a database with utf8mb4 encoding. All tables in this database will inherit this character set unless overridden. _ci means case-insensitive.

Set character set at table level

CREATE TABLE comments (
  id INT AUTO_INCREMENT PRIMARY KEY,
  body TEXT
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Overrides the database character set for this specific table. You can also set it per column.

Set character set at column level

CREATE TABLE products (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
);

Uses utf8mb4_bin (binary collation) for case-sensitive comparison. Different columns can have different collations in the same table.

Change character set on existing database

ALTER DATABASE myapp
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

Changes the default character set for new tables. Existing tables are not affected. Use ALTER TABLE to change individual tables.

Set connection charset

SET NAMES utf8mb4;

Sets the character set for client-server communication. This ensures the client, connection, and server all use the same encoding. Always set this when connecting.

Reference:Common Collations

utf8mb4_general_ci: Fast, less accurate sorting. Good for general use.

utf8mb4_unicode_ci: More accurate Unicode sorting. Better for international text.

utf8mb4_bin: Binary comparison (byte-by-byte). Case-sensitive. Good for passwords, codes, and case-sensitive lookups.

utf8mb4_0900_ai_ci: MySQL 8.0+ default. Uses Unicode 9.0 rules. Accent-insensitive, case-insensitive.

What is the default transaction isolation level in MySQL?

Why should you always use utf8mb4 instead of utf8 in MySQL?

Key Takeaways

  • Isolation levels control how transactions see each other's changes.
  • REPEATABLE READ (MySQL default) prevents dirty reads and non-repeatable reads.
  • READ COMMITTED is useful when you need to see committed changes immediately.
  • SERIALIZABLE is safest but slowest. Use only when strict consistency is required.
  • MySQL's "utf8" is actually 3-byte (utf8mb3). Always use utf8mb4 for full Unicode support.
  • Set CHARACTER SET utf8mb4 at database, table, or column level.
  • Use SET NAMES utf8mb4 when connecting to ensure consistent encoding.

Ready to test your knowledge?

Take a Quiz