Wait On Conflict Functional Spec
YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
#Wait-on-Conflict Concurrency Control in YSQL
#1 Introduction
This document aims to define fail-on-conflict concurrency control behavior in YSQL, wait-on-conflict behavior in PostgreSQL, and outline the requirements for changing YSQL's default concurrency-control semantics.
To understand wait-on-conflict concurrency control in PostgreSQL, it is important to take note of some points about row-level locking and its interaction with DMLs.
#PostgreSQL supports 4 types of row-level locks -
- Exclusive modes (i.e., only 1 transaction can hold such a lock at any time) -
FOR UPDATE- takes exclusive lock on the whole row (blocks the shared locks as well)FOR NO KEY UPDATE- takes exclusive lock on row but allows a shared lock on primary key i.e.,FOR KEY SHARE
- Share access modes
FOR SHARE- shared lock on full row i.e., blocks all exclusive locksFOR KEY SHARE- shared lock on key i.e., allowsFOR NO KEY UPDATE
#PostgreSQL has two notions of conflict -
- Lock-modification conflict: (modification is a change in tuple via a DML) (see example 2).
FOR UPDATE: doesn’t allow any concurrent modification to the locked tuple.FOR NO KEY UPDATE: doesn’t allow any concurrent modification to the locked tuple.FOR SHARE: doesn’t allow any concurrent modification to the locked tuple.FOR KEY SHARE: allow only modification to non-primary key cols
Lock-modification conflict only happens when the period from lock acquire to lock release intersects/overlaps with the read to commit time of the transaction performing the modification.
Lock-modification conflicts result in serialization errors. Note that the lock might be issued before or after the modification (see example 2i and 2ii).
Note that in READ COMMITTED isolation, when a lock-modification conflict occurs, PostgreSQL does some post-processing (called “READ COMMITTED update checking steps”) instead of throwing a 40001.
-
Lock-lock conflict: this conflict only refers to one lock blocking another i.e., just waiting behaviour. It doesn’t refer to a serialization error. In other words, two conflicting locks can only result in blocking and not in a serialization error (or “write” conflict as we roughly call it). Also note that these locks can be taken explicitly using a
SELECT FORor implicitly using a DML. So there can be 4 cases - implicit-implicit, implicit-explicit, explicit-implicit, explicit-explicit lock-lock conflicts.(see example 1 for explicit lock-explicit lock conflict).
DMLs implicitly take the following row-level locks:
DELETE->FOR UPDATEUPDATE->FOR UPDATEif updating a column which has a unique index on it (one that can be used in a foreign key)- All
UPDATEsother than those in point 2 ->FOR NO KEY UPDATE.
In other words, a DML operation consists of locking and modification of a tuple.
Given that writing = locking + modification, the definition of lock-modification conflict is more generic to think of for serialization errors (the usual write-write “conflict” we talk about is just a special case of this. A write-write conflict = implicit lock - modification conflict).
#Fail-on-Conflict in YSQL -
YSQL does not differentiate between a lock-lock conflict and a lock-modification conflict. In either case, such conflicts will be detected during conflict resolution. Conflict resolution in YSQL would not lead to any blocking, but would surely abort either itself or all conflicting transactions based on priorities of the transactions (which are randomly chosen). We call this behavior fail-on-conflict concurrency control.
#Wait-on-Conflict in PostgreSQL -
PostgreSQL uses wait-on-conflict concurrency control, where a transaction will wait upon encountering a lock-lock conflict in order to acquire the locks it needs and proceed.
Wait-on-conflict (In PostgreSQL): When a transaction T1 attempts to take a lock (either implicitly via a write or explicitly using SELECT FOR), it might be blocked by transactions that hold a conflicting lock (note again, either implicitly via a write or explicitly using SELECT FOR). The transaction will then wait for all lock-lock conflicting transactions to end before obtaining the lock. Once the other transactions end, either the lock will be taken or a lock-modification conflict will be detected and a serialization error will occur.
Some points to note are -
- Transaction might skip being added to the queue: If a transaction is trying to acquire a lock that doesn’t conflict with actively held locks but does conflict with a transaction that is already waiting to acquire a conflicting lock, it proceeds to take the lock. In short, the transaction skips being added to the queue itself (see example 3). This improves availability at the potential cost of starvation.
- Transaction already in queue might jump older transactions in the queue: Multiple transactions might be blocked to lock the same tuple and this results in a queue. When an active transaction ends, we check other transactions in the wait queue which the resolved transaction was blocking. Transactions which no longer conflict with other active transactions are unblocked to acquire locks and become active. Note that this implies that a transaction that conflicts with earlier transactions blocked in the queue might still be unblocked if it doesn’t conflict with active transactions. This again improves availability at the potential cost of starvation, and matches the behavior in PostgreSQL.
- In case
statement_timeout or lock_timeoutis non-zero, a blocked transaction will abort after the timeout. (example 4). A zero timeout would imply indefinite waiting.
NOTE: Refer this article for a good overview of row-level locking.
#2 Requirements
- Firstly, support lock-lock type of conflict that only blocks, in YSQL. This is to split the current notion of conflict in YSQL into the two finer notions that Postgres supports.
- In case a transaction T1 issues an RPC which tries to acquire a lock as part of a DML that conflicts with existing locks held by active transactions, the RPC should:
- wait for all transactions that have conflicting locks to end (waiting behaviour)
- throw a serialization error only if a modification that conflicts with the lock has committed
- respect client-specified timeouts and remove itself from the queue
- Implement a distributed deadlock detection algorithm to break cycles. The following properties are required -
- No false positives
- Bound on latency = O (cycle size)
- Add a metric to measure the “intensity” of starvation by measuring the number instances where a transaction jumps the wait queue ahead of other waiting transactions that it conflicts with just because it doesn’t conflict with active transactions.
Once wait-on-conflict concurrency control is supported, YSQL will have the same behaviour as Postgres with regards to transaction waiting behaviour in case of writing/locking, with two exceptions:
- For Serializable isolation level, there will be an extra YSQL-only behaviour of waiting in case a read (or write) intent is to be written that conflicts with an already existing write (or read) intent from another transaction. In PostgreSQL’s Serializable isolation implementation (i.e., SSI), reads don’t take implicit locks and hence don’t “lock” conflict with other writes. And hence, a read doesn’t wait on another write (and vice versa) in PostgreSQL’s Serializable isolation.
- YSQL writes fine grained column level intents in case of modifications to specific columns only. This will allow us to be more fine grained so that modifications to different columns of a row need not result in waiting (possibly followed by a serialization error). This is one aspect in which YSQL would turn out to be better than PostgreSQL - and semantically different in a hopefully beneficial way.
#3 Usage
enable_wait_queues: a cluster-level gflag to turn on wait-on-conflict behavior. This will require a cluster restart. Note that all transactions in the cluster will either use wait-on-conflict OR fail-on-conflict behavior, and mixed behavior is tolerable during restart/migration but otherwise generally not supported.
#Expected Behavior
- Explicit lock - Explicit lock interaction (only blocks)
- i) Lock followed by write (with explicit - implicit lock conflict),
ii) write followed by lock (implicit - explicit lock conflict followed by possible lock - modification serialization error if write commits) and
iii) write followed by write (explicit - explicit lock conflict followed by possible lock - modification serialization error if first write commits)
For each case, we will only explore the interaction of FOR SHARE along with an update
to a non-primary key column. Recall that FOR SHARE doesn’t allow any concurrent modification to the locked tuple.
- Lock followed by write
- Write followed by lock (i.e., results in implicit lock - explicit lock waiting followed by serialization error if write commits)
- Write followed by write
- Proof that a transaction T3 won’t block on another waiting transaction that T3 conflicts with, if T3 doesn’t conflict with an active transaction.
- Transaction aborts in case statement_timeout or lock_timeout is hit. Note that lock_timeout applies to both implicit and explicit locks; skipping examples for lock_timeout since it is exactly the same behaviour as for statement_timeout.
#4 Compatibility
#Cross feature interaction
- Interaction with savepoints. Rolling back a savepoint might result in release of a lock and hence resolve some lock-lock waits.
- READ COMMITTED isolation level (exactly as in PostgreSQL) has a dependency on wait-on-conflict concurrency control. To be precise, on facing a conflict, a transaction has to wait for the conflicting transaction to rollback/commit before retrying the statement. Once unblocked, the read committed session will operate on the newly-committed data when retrying the conflicting statement. If READ COMMITTED is used without wait-on-conflict, we will use an internal retry mechanism which differs slightly from PostgreSQL.
#Versioning and upgrades
This feature is upgrade and downgrade safe. When turning the gflag on/off, or during rolling restarts across versions with the flag “on” in the higher version, if some nodes have wait-on-conflict behavior enabled and some don’t, users will experience mixed (but still correct) behavior. A mix of both fail-on-conflict and wait-on-conflict traffic will result in the following additional YSQL specific semantics -
- If a transaction using fail-on-conflict sees transactions that have written conflicting intents -
- Behaviour is as today
- If a transaction uses wait-on-conflict and sees transactions that have written conflicting intents -
- Wait for all conflicting transactions to end (including any using fail-on-conflict semantics)
Quick Actions
Details
- Type
- Functional Spec
- Author
- yugabyte
- Slug
- yugabyte/design-wait-on-conflict-functional-spec
