Concurrency Control is a mechanism that is used to ensure that multiple transactions can access a database concurrently without conflicting with each other. It is a critical aspect of database systems because it allows multiple transactions to execute simultaneously without affecting the consistency of the data.
There are several techniques that can be used to implement concurrency control, including:
🔏 Locking: It's a way to prevent data items from getting messed with by multiple transactions at the same time. Basically, when a transaction is using some data, we put a little lock on it so nobody else can come along and make changes until the first transaction is done. It's like putting a "Do Not Disturb" sign on your hotel room door when you're inside. So, locking is a pretty handy way to make sure everything stays organized and under control in database systems.
⏳ Timestamp ordering: This is a way to make sure transactions happen in the right order and don't run into conflicts. Basically, each transaction gets a special timestamp, kind of like a VIP pass to a concert. Then, the database system makes sure the transactions happen in the order of their timestamps so nobody gets their toes stepped on. It's like lining up for a rollercoaster - you don't want two people trying to sit in the same seat at the same time! So, with timestamp ordering, we keep everything neat and orderly in the database.
🎛️ Optimistic concurrency control: With this approach, we don't lock anything up - we just let transactions go about their business. It's like a party where everyone's dancing and having a good time without any bouncers checking VIP lists. But, we do have a little trick up our sleeve - we check for conflicts at the end of the transaction when it's time to commit. So, it's kind of like cleaning up after the party - we make sure nobody spilled any drinks or stepped on anyone's toes. That way, we can catch any conflicts and sort them out before they become a bigger problem.
2️⃣ Multi-version concurrency control (MVCC): It's a fancy technique that allows multiple versions of a shared resource to exist at the same time. A writer creates a new version instead of overwriting the old one, so readers can keep using the version they started with, which means that multiple processes can access and update the resource without getting in each other's way. This is a useful approach for keeping everything running smoothly and making sure that transactions don't get bogged down waiting on each other.
📅 Conflict-free replicated data types (CRDTs): It's a pretty cool technique that allows you to replicate data across multiple nodes without running into conflicts. Basically, it uses special data structures and algorithms that ensure that any concurrent updates won't mess with each other. So, you can have multiple nodes accessing and updating the same data without any issues!
📶 Serializability: Strictly speaking, this is the goal rather than a technique of its own. A schedule is serializable when running the transactions concurrently has the same effect as running them one after the other in some serial order. Techniques like locking and timestamp ordering are ways of getting there. This helps to maintain the consistency of the database and prevent any conflicts that might arise from concurrent execution of transactions.
📸 Snapshot Isolation: At the beginning of a transaction, a snapshot of the database is taken. This helps ensure that the transaction only sees a consistent version of the data, and the transaction can commit only if its writes don't conflict with writes committed by other transactions since the snapshot was taken. It's like taking a photo of the database at the beginning of the transaction so that everything remains consistent!
🕵️♀️ Deadlock detection and recovery: A deadlock is when multiple processes are stuck waiting for each other to give up a resource, and nothing can move forward. But don't worry, there's a technique called deadlock detection and recovery that comes to the rescue! It finds those pesky deadlocks and fixes them, usually by aborting one of the stuck transactions so it releases its locks, so your processes can keep on truckin'.
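To make the deadlock detection idea a bit more concrete, here is a minimal Python sketch. It assumes the system keeps a "wait-for" graph, where an edge from T1 to T2 means T1 is waiting for a lock that T2 holds; a cycle in that graph is a deadlock. The names `find_deadlock` and `choose_victim` are made up for this illustration, and the victim policy (abort the highest transaction ID) is just one arbitrary choice, not what any particular database does.

```python
from typing import Dict, List, Optional, Set


def find_deadlock(wait_for: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one cycle of transaction IDs in the wait-for graph, or None."""
    visiting: Set[str] = set()  # transactions on the current DFS path
    done: Set[str] = set()      # transactions fully explored, no cycle through them
    path: List[str] = []

    def dfs(txn: str) -> Optional[List[str]]:
        visiting.add(txn)
        path.append(txn)
        for blocker in sorted(wait_for.get(txn, ())):
            if blocker in visiting:
                # Back edge: the cycle runs from `blocker` to the end of the path.
                return path[path.index(blocker):]
            if blocker not in done:
                cycle = dfs(blocker)
                if cycle:
                    return cycle
        visiting.discard(txn)
        path.pop()
        done.add(txn)
        return None

    for txn in sorted(wait_for):
        if txn not in done:
            cycle = dfs(txn)
            if cycle:
                return cycle
    return None


def choose_victim(cycle: List[str]) -> str:
    """Hypothetical recovery policy: abort the transaction with the highest ID."""
    return max(cycle)
```

For example, with `{"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}` the three transactions wait on each other in a circle, so `find_deadlock` reports the cycle and `choose_victim` picks one transaction to abort and retry.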
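The optimistic approach from the list above can also be sketched in a few lines of Python. This is a toy, single-threaded illustration that assumes every stored value carries a version number; `Store`, `Transaction`, and `ConflictError` are hypothetical names invented for the example. Each transaction remembers the version of everything it read, buffers its writes, and checks at commit time that nothing it read has changed.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


class ConflictError(Exception):
    """Raised when commit-time validation finds a stale read."""


@dataclass
class Store:
    # key -> (value, version); the version grows with every committed write
    data: Dict[str, Tuple[int, int]] = field(default_factory=dict)


@dataclass
class Transaction:
    store: Store
    read_versions: Dict[str, int] = field(default_factory=dict)
    writes: Dict[str, int] = field(default_factory=dict)

    def read(self, key: str) -> int:
        if key in self.writes:  # read your own buffered write
            return self.writes[key]
        value, version = self.store.data.get(key, (0, 0))
        self.read_versions.setdefault(key, version)  # remember what we saw
        return value

    def write(self, key: str, value: int) -> None:
        self.writes[key] = value  # buffered privately, no locks taken

    def commit(self) -> None:
        # Validation phase: everything we read must still be at the same version.
        # In a real system, validation and the write phase would have to run
        # atomically; this sketch ignores that.
        for key, seen in self.read_versions.items():
            _, current = self.store.data.get(key, (0, 0))
            if current != seen:
                raise ConflictError(f"{key} changed since it was read")
        # Write phase: publish buffered writes and bump their versions.
        for key, value in self.writes.items():
            _, current = self.store.data.get(key, (0, 0))
            self.store.data[key] = (value, current + 1)
```

If two transactions both read `x` and then both try to update it, the first one to commit wins and the second gets a `ConflictError`, at which point it would typically be retried from the start.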
Example Interview Question:
Consider a database with a table called 'Accounts' that has the following schema:
Accounts(account_number: INTEGER, balance: INTEGER)
There are two transactions, T1 and T2, that need to execute concurrently. T1 needs to transfer $100 from account A to account B, and T2 needs to transfer $50 from account B to account C.
🟣 Describe how you would use concurrency control to ensure that these transactions are executed in a way that maintains the consistency of the data, and provide a brief explanation of your approach.
One possible solution is to use locking so that the two transactions are serialized wherever they touch the same data. Say T1 goes first: it acquires locks on the rows for accounts A and B, and T2 has to wait until T1 releases its lock on B before it can lock the rows for accounts B and C. This ensures that the transactions are executed in a way that maintains the consistency of the data, because T1 and T2 are not allowed to modify the same rows concurrently. If each transaction also acquires its locks in a fixed order (for example, by account number), the two can never deadlock waiting on each other.
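Here is a rough Python sketch of that locking answer, using threads to stand in for T1 and T2 and one `threading.Lock` per account row. The starting balances (A = 500, B = 200, C = 0) and the names `balances`, `row_locks`, `transfer`, and `run_concurrently` are assumptions made up for the illustration; a real database would manage row locks for you.

```python
import threading
from typing import Dict

# Assumed starting balances; the original question does not give any.
balances: Dict[str, int] = {"A": 500, "B": 200, "C": 0}
# One lock per row of the Accounts table.
row_locks: Dict[str, threading.Lock] = {acct: threading.Lock() for acct in balances}


def transfer(src: str, dst: str, amount: int) -> None:
    """Move `amount` from src to dst while holding both row locks."""
    if src == dst:
        raise ValueError("source and destination must differ")
    # Always lock in sorted order so two transfers cannot deadlock each other.
    first, second = sorted((src, dst))
    with row_locks[first], row_locks[second]:
        if balances[src] < amount:
            raise ValueError("insufficient funds")
        balances[src] -= amount
        balances[dst] += amount
    # Both locks are released only after both rows are updated.


def run_concurrently() -> Dict[str, int]:
    """Run T1 (A -> B, $100) and T2 (B -> C, $50) in parallel threads."""
    t1 = threading.Thread(target=transfer, args=("A", "B", 100))
    t2 = threading.Thread(target=transfer, args=("B", "C", 50))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return dict(balances)
```

Whichever thread gets the lock on B first, the other waits, so the final balances are the same as if T1 and T2 had run one after the other, and the total amount of money across the three accounts never changes.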