batur.dev

Data Consistency, Anomalies, and the Role of Isolation - Chapter 2

image πŸŒ„ Image artificially generated through: Sora

In the first part, we discussed the basic concepts of data consistency and potential anomalies, focusing on the most common anomaly types: Dirty Read and Non-repeatable Read. In this part, we will delve into the details of other frequently encountered anomalies: Write Skew, Read Skew, Lost Update and Phantom Read. If you're ready, let's get started.

πŸ“Œ Write Skew

πŸ“– Definition

Write Skew occurs when two separate transactions check the same condition and, believing the condition still holds, each proceeds to update data in a way that causes an overall violation of system integrity. Although each transaction appears to follow the business rules in isolation, the combined result breaks those rules.

πŸ§ͺ Example Scenario

Let’s revisit our smart building system. Suppose there’s a rule: "At any given time t, only one high-energy-consuming device can be active on each floor." This rule is intended to ensure energy safety. Device statuses are stored in a DEVICE_STATUS table.

Result: From a business rules perspective, multiple "active" devices now exist on the same floor β€” a violation of the rule. Each transaction alone seemed valid, but together they caused an integrity issue.

⚠️ Why It Happens

Write skew frequently occurs in condition-based updates where the database only locks the rows being modified. If decisions are made based on broader conditions (e.g., checking status of multiple records), and the predicate itself isn't locked, integrity risks arise.

πŸ”’ Isolation Level Protection

Isolation Level Write Skew Protection
Read Uncommitted ❌ Not prevented
Read Committed ❌ Not prevented
Repeatable Read ❌ Generally not prevented
Serializable βœ… Prevented

πŸ”Ž Note: Write skew stems from the lack of predicate locking. Since most databases only lock modified rows (not queried conditions), Serializable is the safest option.

πŸ› οΈ Additional Solutions (DB & Application Level)

  1. Using Serializable Isolation Locks the entire condition range using predicate locks. This prevents other transactions from making conflicting decisions.

  2. Enforcing Business Rules at the Application Layer

    • Locking all relevant rows with SELECT FOR UPDATE.
    • Locking the entire set that matches a business rule, not just specific rows.
  3. Using Unique Constraints (If Applicable) Some write skew scenarios can be avoided using unique constraints at the schema level.
    Example: UNIQUE(floor_id, status) ensures only one "active" device per floor.
    β†’ But not all write skew cases are solvable this way.

  4. Optimistic Locking Is Not Enough Mechanisms like @Version only track changes to a specific row. If two different rows are updated simultaneously, optimistic locking won’t detect conflicts.

βš–οΈ Pros and Cons

Approach (e.g., Serializable or Unique Constraint) Pros Cons
Strong data integrity guarantee βœ… Protects against logical rule violations ❌ Can impact performance under high concurrency
Constraint-based prevention βœ… Centralizes rules ❌ May reduce flexibility / scalability in some cases

πŸ“Œ Read Skew

πŸ“– Definition

Read Skew arises when a transaction reads multiple pieces of data that are logically connected, but these values were updated by another transaction at different points in time. As a result, the reading transaction sees a temporally inconsistent snapshot of the system state.

In contrast to write skew β€” which stems from updating independent records based on shared conditions β€” read skew is about partial reads of interrelated data that were modified together but read inconsistently.

πŸ§ͺ Example Scenario

Imagine that in our smart building system, an emergency evacuation decision is based on two environmental conditions:

If both conditions are met, an evacuation is automatically triggered.

The system stores temperature and CO readings as two separate records or sources.

Individually, both conditions are satisfied. But since Transaction A read temperature before the CO level was updated, it saw the temperature from the past and the CO level from the present β€” effectively mixing data from two different moments in time.

This is a read skew: the system made a decision based on temporally inconsistent data.

⚠️ Why It Happens

Read skew occurs when data that is expected to be logically synchronized (e.g., sensor measurements, financial aggregates, balance validations) is updated across multiple records β€” but not read atomically.

It’s especially problematic in systems using Snapshot Isolation (SSI or MVCC), because each record may reflect a different version if they were modified independently.

πŸ”’ Isolation Level Protection

Isolation Level Read Skew Protection
Read Uncommitted ❌ Not prevented
Read Committed ❌ Not prevented
Repeatable Read βœ… Partially protected
Serializable βœ… Fully protected
Snapshot Isolation ❌ Not protected

πŸ”Ž Note: Although Snapshot Isolation provides a consistent snapshot for each table or row, it doesn’t guarantee logical consistency across multiple related values.

πŸ› οΈ Application-Level Prevention Strategies

βš–οΈ Pros and Cons

Prevention Strategy Pros Cons
Serializable Isolation βœ… Ensures logical consistency ❌ Expensive, may impact concurrency
Logical validation w/ timestamps βœ… Application-controlled consistency ❌ Adds application complexity
Data modeling (row co-location) βœ… Natural atomicity ❌ May reduce data flexibility

πŸ“Œ Lost Update

πŸ“– Definition

A Lost Update occurs when two transactions read the same data and then both update it based on their initial read. The last transaction to commit overwrites the first one’s changes without knowing they even happened β€” resulting in a silent data loss.

Unlike read or write skew, lost update doesn’t just lead to inconsistency β€” it causes entire updates to disappear, potentially introducing serious logical errors.

πŸ§ͺ Example Scenario

In our smart building system, multiple applications or administrators can update a device’s operating schedule.

The two updates didn’t conflict logically, but due to both reading the same initial state and then committing independently, one of them was lost.

⚠️ Why It Happens

Lost updates usually occur in systems where updates are performed by read-modify-write cycles, and there’s no mechanism to detect whether the underlying data has changed in the meantime.

This is common in:

πŸ”’ Isolation Level Protection

Isolation Level Lost Update Protection
Read Uncommitted ❌ Not prevented
Read Committed ❌ Not prevented
Repeatable Read ⚠️ May prevent in some DBMS
Serializable βœ… Fully protected

πŸ”Ž Note: Whether Repeatable Read protects against lost updates depends on the database engine. Some support automatic locking, others do not.

πŸ› οΈ Solutions at DB and Application Level

  1. Optimistic Locking (e.g., @Version fields)
    Most ORMs (like Hibernate) allow you to add a version column to entities.
    β†’ If two transactions modify the same row, the second one fails with a version conflict.
    β†’ Requires retry logic.

  2. Pessimistic Locking (e.g., SELECT ... FOR UPDATE)
    Lock the row explicitly during the read, so no one else can modify it.
    β†’ Guarantees correctness but reduces concurrency.

  3. Manual Conflict Detection
    Store a hash or snapshot of the read state. On update, re-check if data has changed.
    β†’ Useful when custom conflict logic is needed.

βš–οΈ Pros and Cons

Solution Pros Cons
Optimistic Locking βœ… Lightweight, highly concurrent ❌ Conflict handling needed (manual retry)
Pessimistic Locking βœ… Safe, protects critical paths ❌ Can lead to deadlocks and contention
Serializable Isolation βœ… Guarantees data integrity ❌ High cost under load

πŸ“Œ Phantom Read

πŸ“– Definition

image πŸŒ„ Image artificially generated through: Sora

A Phantom Read occurs when a transaction executes the same query multiple times within its execution, but the result set changes because another transaction inserted or deleted rows that match the same condition.

This differs from non-repeatable reads, which involve changes to existing rows. Phantom reads involve new rows "appearing" or "disappearing" between queries β€” like a phantom.

πŸ§ͺ Example Scenario

In our smart building system, an energy optimization module queries for all rooms where the power consumption exceeds 1500W and applies energy restrictions to those rooms.

Although the query condition didn’t change, the result set did β€” a new "phantom" record appeared during the transaction.

⚠️ Why It Happens

Phantom reads are caused by predicate-based operations β€” where queries depend on conditions (e.g., WHERE clauses), not just specific row IDs.

Most isolation levels only lock the rows being read or written, not the entire condition range. Therefore, new rows matching the same predicate can be inserted or deleted by concurrent transactions.

πŸ”’ Isolation Level Protection

Isolation Level Phantom Read Protection
Read Uncommitted ❌ Not prevented
Read Committed ❌ Not prevented
Repeatable Read ❌ Not fully prevented (DBMS-dependent)
Serializable βœ… Fully prevented

πŸ”Ž Note: In some databases (e.g., MySQL InnoDB), Repeatable Read does not prevent phantom reads. Only Serializable reliably prevents them by locking predicate ranges.

πŸ› οΈ Technical Solutions

  1. Serializable Isolation
    Safest and simplest fix. Under this level, the system uses predicate locks or index range locks to block other transactions from inserting or deleting matching rows.

  2. Predicate Locking (DBMS-specific)
    Some databases (like PostgreSQL with SSI) support predicate locking without full serializability.

  3. Manual Guarding via Lock Tables or Semaphores
    Applications can define custom logic that prevents inserts/deletes into sensitive row sets during critical queries.

  4. Snapshot or Materialized View Techniques
    Read the initial result set once and avoid querying the database again during the transaction. Works in some read-only cases.

βš–οΈ Pros and Cons

Method Pros Cons
Serializable Isolation βœ… Guarantees full correctness ❌ May hurt scalability and throughput
Predicate Locking (SSI) βœ… Efficient for complex queries ❌ Not available in all databases
Manual Locking / App Guards βœ… Highly customizable ❌ Increases complexity and coupling
Snapshot/Materialization βœ… Avoids concurrent interference in reads ❌ Doesn’t reflect newly inserted data

🎯 End of the Part 2

Thus, we have covered all the types of anomalies we planned to discuss. In the next and final part of this series, I plan to dive deeper into database isolation levels and compare their effectiveness against the anomalies mentioned in the first two parts. This way, I will both compare the isolation levels as solutions and conclude the series with a summary of the topic. Until then, take care of yourself.

πŸ“š Further Reading

Resources that I used in this article and that I think you can benefit from too:

  1. Martin Kleppmann Designing Data-Intensive Applications.
    https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/

  2. PostgreSQL Documentation β€” Transaction Isolation and Consistency.
    https://www.postgresql.org/docs/current/transaction-iso.html

  3. SQL Server Documentation β€” Isolation Levels in SQL Server.
    https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql

  4. Hibernate ORM Documentation β€” Transaction and Concurrency Control.
    https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#locking

#database #distributed systems #software development