Data Consistency, Anomalies, and the Role of Isolation - Chapter 2
In the first part, we discussed the basic concepts of data consistency and potential anomalies, focusing on the most common anomaly types: Dirty Read and Non-repeatable Read. In this part, we will delve into the details of other frequently encountered anomalies: Write Skew, Read Skew, Lost Update and Phantom Read. If you're ready, let's get started.
🔍 Write Skew
📌 Definition
Write Skew occurs when two separate transactions check the same condition and, believing the condition still holds, each proceeds to update data in a way that causes an overall violation of system integrity. Although each transaction appears to follow the business rules in isolation, the combined result breaks those rules.
🧪 Example Scenario
Let's revisit our smart building system. Suppose there's a rule: "At any given time t, only one high-energy-consuming device can be active on each floor." This rule is intended to ensure energy safety. Device statuses are stored in a DEVICE_STATUS table.
- Transaction A queries the devices on the 5th floor. Since all devices are "inactive", it sets one device to "active".
- Simultaneously, Transaction B checks the same floor, sees the same "inactive" state, and sets another device to "active".
- Both transactions commit successfully.
Result: From a business-rules perspective, multiple "active" devices now exist on the same floor, violating the rule. Each transaction alone seemed valid, but together they caused an integrity issue.
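The interleaving above fits in a few lines of Python. This is an in-memory stand-in for the DEVICE_STATUS table (the dict, the device names, and the `activate` helper are all hypothetical), not a real database, but it shows why check-then-act on a stale snapshot breaks the one-active-device rule:

```python
# Hypothetical 5th-floor devices; both start inactive.
devices = {"dev_a": "inactive", "dev_b": "inactive"}

def activate(device, snapshot):
    # Each "transaction" checks the rule against the snapshot it read...
    if all(status == "inactive" for status in snapshot.values()):
        devices[device] = "active"  # ...then writes, unaware of the other

snap_a = dict(devices)  # Transaction A reads
snap_b = dict(devices)  # Transaction B reads concurrently, same state
activate("dev_a", snap_a)
activate("dev_b", snap_b)

active = [d for d, s in devices.items() if s == "active"]
print(active)  # both devices end up active: the rule is violated
```

Each call honored the rule against the state it saw; only the combination broke it.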
⚠️ Why It Happens
Write skew frequently occurs in condition-based updates where the database only locks the rows being modified. If decisions are made based on broader conditions (e.g., checking status of multiple records), and the predicate itself isn't locked, integrity risks arise.
📊 Isolation Level Protection
| Isolation Level | Write Skew Protection |
|---|---|
| Read Uncommitted | ❌ Not prevented |
| Read Committed | ❌ Not prevented |
| Repeatable Read | ❌ Generally not prevented |
| Serializable | ✅ Prevented |
📝 Note: Write skew stems from the lack of predicate locking. Since most databases only lock modified rows (not queried conditions), Serializable is the safest option.
🛠️ Additional Solutions (DB & Application Level)
Using Serializable Isolation: locks the entire condition range using predicate locks. This prevents other transactions from making conflicting decisions.
Enforcing Business Rules at the Application Layer:
- Locking all relevant rows with SELECT ... FOR UPDATE.
- Locking the entire set that matches a business rule, not just specific rows.
Using Unique Constraints (If Applicable): some write skew scenarios can be avoided with unique constraints at the schema level. Example: UNIQUE(floor_id, status) allows at most one "active" row per floor, though it also restricts every other status value, so a partial (filtered) unique index on status = 'active' is usually the cleaner fit. ❌ But not all write skew cases are solvable this way.
Optimistic Locking Is Not Enough: mechanisms like @Version only track changes to a specific row. If two different rows are updated simultaneously, optimistic locking won't detect the conflict.
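As a sketch of the constraint idea: a partial (filtered) unique index constrains only the "active" rows, which matches the business rule more precisely than UNIQUE(floor_id, status). SQLite and PostgreSQL both support this; the table and column names below are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_status (device_id TEXT, floor_id INTEGER, status TEXT)"
)
# Only rows with status = 'active' participate in the uniqueness check,
# so any number of inactive devices may share a floor.
conn.execute(
    "CREATE UNIQUE INDEX one_active_per_floor "
    "ON device_status (floor_id) WHERE status = 'active'"
)

conn.execute("INSERT INTO device_status VALUES ('hvac', 5, 'active')")
try:
    conn.execute("INSERT INTO device_status VALUES ('heater', 5, 'active')")
    second_insert_ok = True
except sqlite3.IntegrityError:
    second_insert_ok = False  # the rule is enforced by the schema itself

print(second_insert_ok)  # False: the database rejects the second activation
```

Because the database itself rejects the second "active" row, the rule holds no matter how the two transactions interleave.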
⚖️ Pros and Cons
| Approach | Pros | Cons |
|---|---|---|
| Serializable isolation | ✅ Strong integrity guarantee; protects against logical rule violations | ❌ Can impact performance under high concurrency |
| Unique constraints | ✅ Centralizes rules in the schema | ❌ May reduce flexibility / scalability in some cases |
🔍 Read Skew
📌 Definition
Read Skew arises when a transaction reads multiple pieces of data that are logically connected, but these values were updated by another transaction at different points in time. As a result, the reading transaction sees a temporally inconsistent snapshot of the system state.
In contrast to write skew, which stems from updating independent records based on shared conditions, read skew is about partial reads of interrelated data that were modified together but read inconsistently.
🧪 Example Scenario
Imagine that in our smart building system, an emergency evacuation decision is based on two environmental conditions:
- Room temperature exceeds 60°C
- Carbon monoxide (CO) level exceeds 100 ppm
If both conditions are met, an evacuation is automatically triggered.
The system stores temperature and CO readings as two separate records or sources.
- Transaction A reads the temperature first and sees 65°C.
- Before A reads the CO level, Transaction B commits a fresh sensor batch that updates both values together: temperature drops to 55°C, CO rises to 120 ppm.
- Transaction A now reads the CO value and sees 120 ppm.
In A's view, both conditions are satisfied (65°C and 120 ppm). Yet no actual system state ever met both: before B's commit the readings were (65°C, 90 ppm), and after it they were (55°C, 120 ppm). A mixed the temperature from the past with the CO level from the present, combining data from two different moments in time.
This is read skew: the system made a decision based on temporally inconsistent data.
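A sharpened variant of this scenario in plain Python (no real database; here the writer commits both readings together, which makes the mixed-snapshot problem unambiguous, and the dict is a hypothetical stand-in for the sensor records):

```python
# Two related readings that are always written together by one writer.
readings = {"temperature": 65, "co_ppm": 90}

temp = readings["temperature"]  # Transaction A reads: 65 degC

# Transaction B commits a fresh sensor batch, updating both values at once.
readings.update({"temperature": 55, "co_ppm": 120})

co = readings["co_ppm"]  # Transaction A reads: 120 ppm

should_evacuate = temp > 60 and co > 100
print(should_evacuate)  # True, although no real state ever satisfied both
```

The evacuation fires even though neither (65, 90) nor (55, 120) meets both thresholds; the decision was made on a state that never existed.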
⚠️ Why It Happens
Read skew occurs when data that is expected to be logically synchronized (e.g., sensor measurements, financial aggregates, balance validations) is updated across multiple records but not read atomically.
It is especially common under Read Committed in MVCC databases, where each statement may read from a different snapshot. Full Snapshot Isolation avoids it by serving every read in the transaction from the same snapshot.
📊 Isolation Level Protection
| Isolation Level | Read Skew Protection |
|---|---|
| Read Uncommitted | ❌ Not prevented |
| Read Committed | ❌ Not prevented |
| Repeatable Read | ✅ Generally protected |
| Serializable | ✅ Fully protected |
| Snapshot Isolation | ✅ Protected within a single transaction |
📝 Note: Snapshot Isolation serves every read in a transaction from the same snapshot of the whole database, so one transaction cannot mix values from different moments. Read skew arises under weaker levels such as Read Committed, where each statement may see a different snapshot.
🛠️ Application-Level Prevention Strategies
- Read all related data at once using a transactional boundary.
- Store related fields in the same row or record, if appropriate, to force atomic visibility.
- Use explicit version tags or timestamps to validate that the data set belongs to the same moment in time.
- Where possible, promote Serializable Isolation for critical logic.
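The version/timestamp idea from the list above can be sketched like this (the `epoch` field and the `read_consistent` helper are hypothetical; the point is simply to reject a data set whose members were written at different moments):

```python
# Hypothetical readings, each tagged with the "epoch" in which it was written.
readings = {
    "temperature": {"value": 65, "epoch": 41},
    "co_ppm":      {"value": 120, "epoch": 42},
}

def read_consistent(keys):
    """Accept the data set only if every value comes from the same epoch."""
    snapshot = {k: readings[k] for k in keys}
    epochs = {v["epoch"] for v in snapshot.values()}
    if len(epochs) != 1:
        return None  # mixed moments in time: the caller should retry
    return {k: v["value"] for k, v in snapshot.items()}

print(read_consistent(["temperature", "co_ppm"]))  # None: epochs 41 vs 42
```

A caller that gets `None` re-reads instead of acting, which turns a silent skew into an explicit retry.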
⚖️ Pros and Cons
| Prevention Strategy | Pros | Cons |
|---|---|---|
| Serializable Isolation | ✅ Ensures logical consistency | ❌ Expensive, may impact concurrency |
| Logical validation w/ timestamps | ✅ Application-controlled consistency | ❌ Adds application complexity |
| Data modeling (row co-location) | ✅ Natural atomicity | ❌ May reduce data flexibility |
🔍 Lost Update
📌 Definition
A Lost Update occurs when two transactions read the same data and then both update it based on their initial read. The last transaction to commit overwrites the first one's changes without knowing they ever happened, resulting in silent data loss.
Unlike read or write skew, a lost update doesn't just lead to inconsistency: it causes entire updates to disappear, potentially introducing serious logical errors.
🧪 Example Scenario
In our smart building system, multiple applications or administrators can update a deviceβs operating schedule.
- Transaction A reads the current schedule (e.g., 08:00–18:00), modifies the end time to 17:00, and prepares to commit.
- Simultaneously, Transaction B reads the same schedule, modifies the start time to 09:00, and commits its change.
- Then Transaction A commits. Its update overwrites Transaction B's, unintentionally erasing the previously saved 09:00 start time.
The two updates didn't conflict logically, but because both read the same initial state and then committed independently, one of them was lost.
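The read-modify-write race can be reproduced with two plain dictionaries standing in for the transactions' working copies (an in-memory sketch, no real database involved):

```python
schedule = {"start": "08:00", "end": "18:00"}

# Both transactions read the same initial state.
copy_a = dict(schedule)
copy_b = dict(schedule)

copy_b["start"] = "09:00"
schedule.update(copy_b)  # B commits first: start becomes 09:00

copy_a["end"] = "17:00"
schedule.update(copy_a)  # A commits last, writing back its stale copy

print(schedule)  # {'start': '08:00', 'end': '17:00'} - B's 09:00 is gone
```

A never saw B's commit, so writing back its whole stale copy silently reverted the start time.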
⚠️ Why It Happens
Lost updates usually occur in systems where updates are performed via read-modify-write cycles and there's no mechanism to detect whether the underlying data has changed in the meantime.
This is common in:
- Systems using Read Committed isolation
- Applications lacking row-level locking or optimistic concurrency control
📊 Isolation Level Protection
| Isolation Level | Lost Update Protection |
|---|---|
| Read Uncommitted | ❌ Not prevented |
| Read Committed | ❌ Not prevented |
| Repeatable Read | ⚠️ May prevent, depending on the DBMS |
| Serializable | ✅ Fully protected |
📝 Note: Whether Repeatable Read protects against lost updates depends on the database engine. PostgreSQL's Repeatable Read, for example, aborts the second writer with a serialization error, while MySQL InnoDB's does not detect the conflict.
🛠️ Solutions at DB and Application Level
Optimistic Locking (e.g., @Version fields): most ORMs (like Hibernate) let you add a version column to entities. ✅ If two transactions modify the same row, the second one fails with a version conflict. ❗ Requires retry logic.
Pessimistic Locking (e.g., SELECT ... FOR UPDATE): lock the row explicitly during the read, so no one else can modify it. ✅ Guarantees correctness but reduces concurrency.
Manual Conflict Detection: store a hash or snapshot of the read state; on update, re-check whether the data has changed. ✅ Useful when custom conflict logic is needed.
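A minimal optimistic-locking sketch using SQLite, with a hand-rolled version column rather than an ORM's @Version (the table and column names are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedule (id INTEGER PRIMARY KEY,"
    " start_time TEXT, end_time TEXT, version INTEGER)"
)
conn.execute("INSERT INTO schedule VALUES (1, '08:00', '18:00', 0)")

def update_end(new_end, expected_version):
    # Compare-and-swap: the UPDATE matches a row only if the version
    # is still the one this "transaction" originally read.
    cur = conn.execute(
        "UPDATE schedule SET end_time = ?, version = version + 1 "
        "WHERE id = 1 AND version = ?",
        (new_end, expected_version),
    )
    return cur.rowcount == 1  # 0 rows matched: someone else won the race

# Both writers read version 0; the first write wins...
first_write_ok = update_end("17:30", 0)
# ...and the second is rejected instead of silently overwriting it.
second_write_ok = update_end("17:00", 0)
print(first_write_ok, second_write_ok)  # True False
```

The losing writer gets an explicit failure signal (rowcount 0) and can re-read and retry, which is exactly the retry logic mentioned above.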
⚖️ Pros and Cons
| Solution | Pros | Cons |
|---|---|---|
| Optimistic Locking | ✅ Lightweight, highly concurrent | ❌ Conflict handling needed (manual retry) |
| Pessimistic Locking | ✅ Safe, protects critical paths | ❌ Can lead to deadlocks and contention |
| Serializable Isolation | ✅ Guarantees data integrity | ❌ High cost under load |
🔍 Phantom Read
📌 Definition
A Phantom Read occurs when a transaction executes the same query multiple times during its lifetime, but the result set changes because another transaction inserted or deleted rows matching the same condition.
This differs from non-repeatable reads, which involve changes to existing rows. Phantom reads involve new rows "appearing" or "disappearing" between queries, like a phantom.
🧪 Example Scenario
In our smart building system, an energy optimization module queries for all rooms where the power consumption exceeds 1500W and applies energy restrictions to those rooms.
- Transaction A runs the query and gets 3 rooms.
- While A is still executing, Transaction B inserts a new room with 1600W usage and commits.
- Transaction A re-runs the same query before it finishes and now sees 4 rooms.
Although the query condition didn't change, the result set did: a new "phantom" record appeared during the transaction.
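The same interleaving in a few lines of Python, with a list standing in for the table and a predicate function for the query (all names are made up for the sketch):

```python
rooms = [
    {"room": "501", "watts": 1600},
    {"room": "502", "watts": 1800},
    {"room": "503", "watts": 2000},
]

def high_consumers():
    # The "query": same predicate every time, no specific row IDs.
    return [r["room"] for r in rooms if r["watts"] > 1500]

first = high_consumers()                        # Transaction A: 3 rooms

rooms.append({"room": "504", "watts": 1600})    # Transaction B commits an insert

second = high_consumers()                       # same predicate, now 4 rooms

print(len(first), len(second))  # 3 4 - a phantom row appeared mid-transaction
```

No existing row changed; the anomaly comes entirely from a new row satisfying the predicate between the two reads.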
⚠️ Why It Happens
Phantom reads are caused by predicate-based operations, where queries depend on conditions (e.g., WHERE clauses) rather than specific row IDs.
Most isolation levels only lock the rows being read or written, not the entire condition range. Therefore, new rows matching the same predicate can be inserted or deleted by concurrent transactions.
📊 Isolation Level Protection
| Isolation Level | Phantom Read Protection |
|---|---|
| Read Uncommitted | ❌ Not prevented |
| Read Committed | ❌ Not prevented |
| Repeatable Read | ⚠️ Not fully prevented (DBMS-dependent) |
| Serializable | ✅ Fully prevented |
📝 Note: Phantom protection under Repeatable Read is engine-specific. The ANSI definition allows phantoms at this level, although MySQL InnoDB largely avoids them through consistent snapshots and next-key locking. Only Serializable reliably prevents them by locking predicate ranges.
🛠️ Technical Solutions
Serializable Isolation: the safest and simplest fix. Under this level, the system uses predicate locks or index range locks to block other transactions from inserting or deleting matching rows.
Predicate Locking (DBMS-specific): some databases (like PostgreSQL with its Serializable Snapshot Isolation, SSI) detect predicate-level conflicts by tracking read/write dependencies rather than taking blocking range locks.
Manual Guarding via Lock Tables or Semaphores: applications can define custom logic that prevents inserts/deletes into sensitive row sets during critical queries.
Snapshot or Materialized View Techniques: read the initial result set once and avoid querying the database again during the transaction. Works in some read-only cases.
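The snapshot technique from the list above, sketched in Python: evaluate the predicate once, freeze the result, and drive all later work off the frozen copy (again an in-memory stand-in, not a real materialized view):

```python
rooms = [
    {"room": "501", "watts": 1600},
    {"room": "502", "watts": 1800},
]

# Query the predicate exactly once; the tuple is the frozen result set.
snapshot = tuple(r["room"] for r in rooms if r["watts"] > 1500)

rooms.append({"room": "504", "watts": 1700})  # concurrent insert afterwards

# All later steps iterate the snapshot, so the new row cannot "appear" mid-run.
restricted = list(snapshot)  # e.g., apply energy restrictions to these rooms

print(snapshot)  # ('501', '502') - unaffected by the concurrent insert
```

The trade-off stated in the table below applies: the frozen set is stable but deliberately blind to rows inserted after the initial read.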
⚖️ Pros and Cons
| Method | Pros | Cons |
|---|---|---|
| Serializable Isolation | ✅ Guarantees full correctness | ❌ May hurt scalability and throughput |
| Predicate Locking (SSI) | ✅ Efficient for complex queries | ❌ Not available in all databases |
| Manual Locking / App Guards | ✅ Highly customizable | ❌ Increases complexity and coupling |
| Snapshot/Materialization | ✅ Avoids concurrent interference in reads | ❌ Doesn't reflect newly inserted data |
🎯 End of Part 2
Thus, we have covered all the types of anomalies we planned to discuss. In the next and final part of this series, I plan to dive deeper into database isolation levels and compare their effectiveness against the anomalies mentioned in the first two parts. This way, I will both compare the isolation levels as solutions and conclude the series with a summary of the topic. Until then, take care of yourself.
📚 Further Reading
Resources that I used in this article and that I think you can benefit from too:
- Martin Kleppmann, Designing Data-Intensive Applications. https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/
- PostgreSQL Documentation, Transaction Isolation. https://www.postgresql.org/docs/current/transaction-iso.html
- SQL Server Documentation, SET TRANSACTION ISOLATION LEVEL. https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
- Hibernate ORM Documentation, Locking. https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#locking