Understanding ACID in Database Engineering
In software engineering, ensuring data consistency and reliability is paramount. One foundational concept that helps achieve this is ACID, which stands for Atomicity, Consistency, Isolation, and Durability. ACID properties provide a set of guarantees that define how database transactions should behave to maintain data integrity. In this article, we’ll delve into each ACID property, explain its significance, and provide practical code examples using a popular relational database management system, PostgreSQL.
Transaction
A transaction is a sequence of one or more database operations (such as queries, inserts, updates, or deletes) that are treated as a single, indivisible unit of work. The primary purpose of a transaction is to ensure data integrity by guaranteeing that a series of related database changes either all succeed or all fail, leaving the database in a consistent state.
The lifespan of a transaction refers to the duration during which a transaction is active and can make changes to the database. Transactions have a well-defined beginning and end.
- Start of a Transaction: A transaction begins when it is initiated explicitly by the application or when a specific operation is performed that automatically starts a transaction (e.g., the execution of an SQL statement like
BEGIN
orSTART TRANSACTION
). - Execution of Database Operations: During its lifespan, a transaction can execute various database operations. These operations can include reading data from the database, modifying existing data, or adding new data.
- End of a Transaction: A transaction ends when one of the following occurs:
- A successful completion: If all database operations within the transaction are executed successfully, the transaction is typically committed. Committing a transaction makes all its changes permanent and durable in the database.
- An error or failure: If any operation within the transaction encounters an error or failure, the transaction is rolled back. Rolling back a transaction ensures that none of its changes are applied to the database, leaving it in its original state.
Atomicity
Imagine you’re transferring money from one bank account to another online. This operation involves two steps: deducting the money from one account and adding it to the other. Now, if something goes wrong in between, we want to ensure that neither account is left in an inconsistent state.
This is what we call “atomicity.” It ensures that a series of database operations either all succeed or, if one fails, everything is rolled back to its original state. Think of it as an all-or-nothing rule for transactions.
Suppose you have a banking application, and you want to transfer $100 from User A’s account to User B’s account. Here’s how you can ensure atomicity using SQL commands in PostgreSQL
-- Start a transaction
BEGIN;
-- Step 1: Deduct $100 from User A's account
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'UserA';
-- Step 2: Add $100 to User B's account
UPDATE accounts SET balance = balance + 100 WHERE account_id = 'UserB';
-- Commit the transaction if both steps succeed
COMMIT;
-- Rollback the transaction if any step fails
ROLLBACK;
By following this approach, we guarantee that money transfers are atomic operations. Either the entire transfer occurs successfully, leaving both accounts updated, or nothing changes if any part of the transfer fails. This ensures data consistency and prevents partial or erroneous updates in the database.
Consistency
Imagine a library where books have unique ISBN numbers, and no two books should have the same number. Now, if someone tries to add a book with a duplicate ISBN, we need to ensure our library database stays consistent.
This is where “consistency” comes into play. It ensures that a transaction takes the database from one consistent state to another, respecting rules and constraints.
Consistency guarantees that a database transaction transforms the database from one consistent state to another consistent state. It enforces constraints, ensuring that data remains within specified rules and boundaries.
BEGIN;
-- Attempt to add a book with a duplicate ISBN
INSERT INTO library (isbn, title) VALUES ('978-0451524935', '1984');
INSERT INTO library (isbn, title) VALUES ('978-0451524935', 'Brave New World');
COMMIT;
In this case, the second INSERT
operation violates a UNIQUE constraint, causing the entire transaction to fail. PostgreSQL ensures that no inconsistent data is stored.
Isolation: Preventing Cross Talk
Isolation ensures that multiple concurrent transactions do not interfere with each other. It guarantees that each transaction appears to be executed in isolation, as if it were the only transaction running on the system. Various isolation levels, such as Read Uncommitted, Read Committed, Repeatable Read, and Serializable, control the degree of isolation.
Imagine a busy coffee shop with two baristas making lattes. Both are simultaneously preparing a latte for the same customer. You wouldn’t want one barista to interfere with the other’s work, right? Exactly!
Now, let’s isolate with code using the Serializable isolation level in PostgreSQL:
-- Set the isolation level to Serializable
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- Transaction 1: Deduct 10 items from stock
UPDATE products SET stock = stock - 10 WHERE product_id = 123;
-- Concurrent Transaction 2: Deduct 20 items from stock
UPDATE products SET stock = stock - 20 WHERE product_id = 123;
COMMIT;
With “Serializable” isolation, PostgreSQL ensures these transactions run in a way that avoids conflicts, keeping our data safe and sound.
Durability
Durability guarantees that once a transaction is committed, its changes will persist, even in the face of system failures. It ensures that the data will not be lost or corrupted.
In PostgreSQL, durability is guaranteed by default. When you commit a transaction, the changes are stored permanently, and PostgreSQL takes care of ensuring data durability.
Conclusion
ACID properties play a vital role in maintaining data integrity and reliability. By understanding and implementing these properties in your database systems, you can ensure that your data remains consistent, isolated from concurrent transactions, and durable even in the event of system failures. PostgreSQL and other relational database systems provide robust support for ACID transactions, making them a reliable choice for mission-critical applications.
Excerpts from my notes on Fundamentals of Database Engineering by Hussein Nasser