The Many Faces Of Consistency
This is a lovely paper (2016) from Marcos Aguilera and Doug Terry. It is an easy read. It manages to be technical and enlightening without involving analysis, derivation, or formulas.
The paper talks about consistency. In distributed/networked systems (which includes pretty much any practical computer system today), data sharing and replication bring up a fundamental question: What should happen if a client modifies some data items and simultaneously, or within a short time, another client reads or modifies the same items, possibly at a different replica?
The answer depends on the context/application. Sometimes eventual consistency is what is desired (e.g., DNS), and sometimes you need strong consistency (e.g., reservations, accounting).
The paper points out two different types of consistency, state consistency and operation consistency, and focuses mainly on comparing/contrasting these two approaches.
When I see the terms state versus operation consistency, without reading the rest of the paper, my gut reaction is to conclude that this is a matter of abstraction. If you are implementing the system, you work at the state consistency level. If you are exposing this to the clients, since they do not need to know the internals of the system and would only care about the operations they invoke, then you work at the operation consistency level. State consistency can help support and provide operation consistency to the clients.
My gut reaction turns out to be pretty accurate, yet the paper goes into more detail and elaborates on some subtleties. I am glad to read about the two approaches in more detail. I think, going forward, it would be important to relate/bridge/reconcile these two approaches.
1. State consistency
State consistency pertains to the state of the system (which comprises the current values of the data items). These are properties that the system should satisfy despite concurrent access and the existence of multiple replicas.
A major subcategory of state consistency is one defined by an invariant, that is, a predicate on the state that must evaluate to true. For example, in a concurrent program, a singly linked list must not contain cycles. As another example, in a primary-backup system the mutual consistency invariant requires that replicas have the same state when there are no outstanding updates.
It is possible to weaken/loosen invariants to include error bounds, probabilistic guarantees, and eventual satisfaction.
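To make the invariant idea concrete, here is a minimal Python sketch of the mutual consistency invariant and of one weakened variant with an error bound. The `Replica` class, the function names, and the numeric key `"x"` are my own illustrative assumptions; none of them come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Hypothetical replica: its state is just a dict of data items.
    state: dict = field(default_factory=dict)


def mutual_consistency(replicas, outstanding_updates):
    """Invariant: with no outstanding updates, all replicas hold the same state."""
    if outstanding_updates:
        return True  # the predicate only constrains quiescent states
    return all(r.state == replicas[0].state for r in replicas)


def bounded_divergence(replicas, key, max_error):
    """Weakened invariant: numeric values of `key` differ by at most max_error."""
    values = [r.state.get(key, 0) for r in replicas]
    return max(values) - min(values) <= max_error


primary, backup = Replica({"x": 10}), Replica({"x": 10})
ok_quiescent = mutual_consistency([primary, backup], outstanding_updates=[])

primary.state["x"] = 12  # update applied at the primary only
ok_in_flight = mutual_consistency([primary, backup], outstanding_updates=[("x", 12)])
violated = mutual_consistency([primary, backup], outstanding_updates=[])
within_bound = bounded_divergence([primary, backup], "x", max_error=5)
```

Note that both predicates look only at the state, never at the results handed back to clients. That is what makes them state consistency properties.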
2. Operation consistency
State consistency is limited to properties on state, but in many cases clients care little about the system state and more about the results that they obtain from the system. This calls for a different form of consistency, operation consistency, which pertains to the behavior that clients observe from interacting with the system.
Operation consistency has subcategories based on the different ways to define the consistency property.
2.1 Sequential equivalence
This subcategory defines the permitted operation results of a concurrent execution in terms of the permitted operation results in a sequential execution. Some examples are as follows.
Linearizability is a strong form of consistency. Each operation must appear to occur at an instantaneous point between its start time (when the client submits it) and finish time (when the client receives the response), and execution at these instantaneous points forms a valid sequential execution. More precisely, there must exist a legal total order T of all operations with their results, such that (1) T is consistent with the partial order <, meaning that if op1 finishes before op2 starts then op1 appears before op2 in T, and (2) T defines a correct sequential execution.
Sequential consistency is weaker than linearizability. It requires that operations execute as if they were totally ordered in a way that respects the order in which *each client* issues operations. More precisely, the partial order < is this time defined as: op1 < op2 iff both operations are executed by *the same client* and op1 finishes before op2 starts. In other words, while linearizability imposes an across-clients system-level ordering, sequential consistency is content with imposing a client-centric ordering.
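The two definitions differ only in the partial order < that the total order T must respect, and a brute-force checker makes that visible. The following is a sketch under my own assumptions: a single read/write register with initial value 0, distinct operations, and a history small enough to enumerate all permutations. The `Op` record and the function names are hypothetical.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    client: str
    kind: str      # "write" or "read"
    value: int     # value written, or value returned by the read
    start: float
    finish: float


def legal_register(order, initial=0):
    """Condition (2): the order is a correct sequential execution of a register."""
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def respects(order, precedes):
    """Condition (1): the total order is consistent with the partial order."""
    pos = {op: i for i, op in enumerate(order)}
    return all(pos[a] < pos[b] for a in order for b in order if precedes(a, b))


def exists_order(history, precedes):
    return any(respects(p, precedes) and legal_register(p)
               for p in permutations(history))


def linearizable(history):
    # op1 < op2 iff op1 finishes before op2 starts (any clients).
    return exists_order(history, lambda a, b: a.finish < b.start)


def sequentially_consistent(history):
    # op1 < op2 iff same client and op1 finishes before op2 starts.
    return exists_order(
        history, lambda a, b: a.client == b.client and a.finish < b.start)


# Client A's write completes before client B's read starts, yet B reads the old value.
history = [Op("A", "write", 1, 0.0, 1.0), Op("B", "read", 0, 2.0, 3.0)]
is_lin = linearizable(history)
is_seq = sequentially_consistent(history)
```

In this history the stale read violates linearizability, but it is fine under sequential consistency, because the read can be ordered before the write without reordering any single client's operations.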
The next examples pertain to systems that support transactions. Intuitively, a transaction is a bundle of one or more operations that must be executed as a whole. The system provides an isolation property, which ensures that transactions do not significantly interfere with one another. There are many isolation properties: serializability, strong session serializability, order-preserving serializability, snapshot isolation, read committed, repeatable reads, etc. All of these are forms of operation consistency, and several of them (e.g., serializability, strong session serializability, and order-preserving serializability, aka strict serializability) are of the sequential equivalence subcategory.
2.2 Reference equivalence
Reference equivalence is a generalization of sequential equivalence. Examples of this occur often in database systems.
Snapshot isolation requires that transactions behave identically to the following reference implementation. When a transaction starts, it gets assigned a monotonic start timestamp. When the transaction reads data, it reads from a snapshot of the system as of the start timestamp. When a transaction T1 wishes to commit, the system obtains a monotonic commit timestamp and verifies whether there is some other transaction T2 such that (1) T2 updates some item that T1 also updates, and (2) T2 has committed with a commit timestamp between T1's start and commit timestamps. If so, then T1 is aborted; otherwise, T1 is committed and all its updates are applied instantaneously as of the time of T1's commit timestamp.
Another example is one-copy serializability, where the replicated system must behave like a single node (no replicas!) and provide serializability.
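The snapshot isolation reference implementation described above is short enough to write down. Here is a single-process Python sketch of it. The class and method names (`SnapshotStore`, `Txn`, `begin`, `commit`) are my own assumptions, and a shared counter stands in for the monotonic timestamp source.

```python
import itertools


class SnapshotStore:
    def __init__(self):
        self._clock = itertools.count(1)   # monotonic timestamps
        self._versions = {}                # key -> list of (commit_ts, value)

    def begin(self):
        return Txn(self, next(self._clock))


class Txn:
    def __init__(self, store, start_ts):
        self.store, self.start_ts, self.writes = store, start_ts, {}

    def read(self, key):
        if key in self.writes:             # a transaction sees its own writes
            return self.writes[key]
        # Read from the snapshot as of the start timestamp.
        visible = [(ts, v) for ts, v in self.store._versions.get(key, [])
                   if ts <= self.start_ts]
        return max(visible, key=lambda p: p[0])[1] if visible else None

    def write(self, key, value):
        self.writes[key] = value           # buffered until commit

    def commit(self):
        commit_ts = next(self.store._clock)
        # Abort if another transaction updated one of our items and
        # committed between our start and commit timestamps.
        for key in self.writes:
            for ts, _ in self.store._versions.get(key, []):
                if self.start_ts < ts < commit_ts:
                    return False
        # Otherwise apply all updates as of the commit timestamp.
        for key, value in self.writes.items():
            self.store._versions.setdefault(key, []).append((commit_ts, value))
        return True


store = SnapshotStore()
t1, t2 = store.begin(), store.begin()
t1.write("x", 1)
t2.write("x", 2)
first = t1.commit()    # no conflicting commit yet
second = t2.commit()   # t1 committed "x" inside t2's window
seen = store.begin().read("x")
```

The two concurrent writers of "x" illustrate the conflict rule: the first to commit wins and the second is aborted.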
2.3 Read-write centric
The read-write centric subcategory applies to systems with two very specific operations: read and write. Consistency is defined with respect to the set of writes that could have potentially affected the read.
Read-my-writes requires that a read by a client sees at least all writes previously executed by the same client, in the order in which they were executed. This is relevant when clients expect to observe their own writes, but can tolerate delays before observing the writes of others.
Bounded staleness bounds the time (or number of updates) it takes for writes to be seen by reads: A read must see at least all writes that complete d time (or number of updates) before the read started.
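Both properties can be phrased as simple checks over a log of reads and writes. The sketch below makes simplifying assumptions of my own: a single data item, writes numbered by a total version order, and a read that reports the highest version it reflects (so seeing version v means seeing all writes up to v). The `Write` and `Read` records are hypothetical.

```python
from typing import NamedTuple


class Write(NamedTuple):
    client: str
    version: int    # position in the system's write order
    finish: float   # time the write completed


class Read(NamedTuple):
    client: str
    seen_version: int
    start: float    # time the read started


def read_my_writes(read, writes):
    """The read reflects every earlier write by the same client."""
    own = [w.version for w in writes
           if w.client == read.client and w.finish <= read.start]
    return read.seen_version >= max(own, default=0)


def bounded_staleness(read, writes, d):
    """The read reflects every write completed at least d time before it started."""
    old_enough = [w.version for w in writes if w.finish <= read.start - d]
    return read.seen_version >= max(old_enough, default=0)


writes = [Write("A", 1, 1.0), Write("B", 2, 2.0), Write("A", 3, 3.0)]
stale_read = Read("B", seen_version=2, start=4.0)

rmw_ok = read_my_writes(stale_read, writes)            # B sees its own write
tight_ok = bounded_staleness(stale_read, writes, 0.5)  # misses A's write at 3.0
loose_ok = bounded_staleness(stale_read, writes, 2.0)  # only writes up to 2.0 required
```

The same read satisfies read-my-writes and a loose staleness bound while violating a tight one, which shows that these are independent knobs.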
3. State versus operation consistency
Operation consistency is, well..., operational! I was inculcated as part of my PhD training in distributed systems to avoid operational reasoning, as it fails to work for concurrent execution. This is also what I teach my students within the first week of my distributed systems class. Operational reasoning does not scale since there are too many corner cases to check. For reasoning about distributed systems, we use invariant-based reasoning, which lends itself better to state consistency.
However, the truth is multidimensional. While unsuitable for reasoning/verification, operation consistency has its advantages.
On which type of consistency to use, the paper suggests the following:
"First, think about the negation of consistency: what are the inconsistencies that must be avoided? If the answer is most easily described by an undesirable state (e.g., two replicas diverge), then use state consistency. If the answer is most easily described by an incorrect result to an operation (e.g., a read returns stale data), then use operation consistency.
A second important consideration is application dependency. Many operation consistency and some state consistency properties are application independent (e.g., serializability, linearizability, mutual consistency, eventual consistency). We recommend trying to use such properties, before defining an application-specific one, because the mechanisms to enforce them are well understood. If the system requires an application specific property, and state and operation consistency are both natural choices, then we recommend using state consistency due to its simplicity."
Notice that while the operation consistency section listed more than a dozen consistency levels, the state consistency section named just a couple of invariant types, including a vaguely named mutual consistency invariant. This is because state consistency is implementation specific, a whitebox approach. It is more suitable for distributed systems designers/builders than for users/clients. By restricting the domain to specific operations (such as read and write), operation consistency is able to treat the system as a blackbox and provide a reusable abstraction to the users/clients.
4. What is actually behind the curtain?
Here is another advantageous use case for the operation consistency approach. It provides you an abstraction (i.e., a curtain, a veil) that you can leverage in your implementation. Behind this curtain, you can pull tricks. The paper gives this simple example.
"An interesting example is a storage system with three servers replicated using majority quorums, where (1) to write data, the system attaches a monotonic timestamp and stores the data at two (a majority of) servers, and (2) to read, the system fetches the data from two servers; if the servers return the same data, the system returns the data to the client; otherwise, the system picks the data with the highest timestamp, stores that data and its timestamp in another server (to ensure that two servers have the data), and returns the data to the client. This system violates mutual consistency, because when there are no outstanding operations, one of the servers deviates from the other two. However, this inconsistency is not observable in the results returned by reads, since a read filters out the inconsistent server by querying a majority. In fact, this storage system satisfies linearizability, one of the strongest forms of operation consistency."
This is kind of a lazy evaluation idea. You don't always need to expend work/energy to keep the database consistent; you tidy up the database only when it is queried, only when it needs to perform.
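The three-server quorum example can be sketched in a few lines of Python. This is a single-process toy under my own assumptions: the `QuorumStore` class, the `targets` parameter for choosing which servers an operation contacts, and the shared counter used as the monotonic timestamp are all illustrative, and failures and concurrency are not modeled.

```python
import itertools


class QuorumStore:
    def __init__(self, n=3):
        self.servers = [{"ts": 0, "data": None} for _ in range(n)]
        self.majority = n // 2 + 1
        self._clock = itertools.count(1)   # monotonic timestamps

    def write(self, data, targets):
        assert len(set(targets)) >= self.majority
        ts = next(self._clock)
        for i in targets:                  # store at a majority only
            self.servers[i] = {"ts": ts, "data": data}

    def read(self, targets):
        assert len(set(targets)) >= self.majority
        replies = [self.servers[i] for i in targets]
        newest = max(replies, key=lambda r: r["ts"])
        for i in targets:                  # repair lagging servers on read
            if self.servers[i]["ts"] < newest["ts"]:
                self.servers[i] = dict(newest)
        return newest["data"]


store = QuorumStore()
store.write("v1", targets=[0, 1])          # server 2 never hears about it

# With no outstanding operations the replicas disagree: mutual consistency fails.
diverged = len({s["ts"] for s in store.servers}) > 1

# Any majority overlaps the write quorum, so the read still returns "v1".
result = store.read(targets=[1, 2])
repaired = store.servers[2]["data"]
```

The divergence exists in the state but never shows up in a read result, and the repair work is done only when a read happens to touch the stale server.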
The "TAPIR: Building Consistent Transactions with Inconsistent Replication (SOSP'15)" paper also builds on this premise. Irene puts it this way:
"Today, systems that want to provide strong guarantees use Paxos (or, if you are hipper than me, RAFT), and everyone else uses something cheaper. Paxos enforces a strict serial ordering of operations across replicas, which is useful, but requires coordination across replicas on every operation, which is expensive.
What we found in the TAPIR project is that Paxos is too strong for some strong system guarantees and, as a result, is wasting work and performance for those systems. For example, a lock server wants mutual exclusion, but Paxos provides a strict serial ordering of lock operations. This means that a lock server built using Paxos for replication is coordinating across replicas even when it is not necessary to ensure mutual exclusion.
Even more interesting, a transactional storage system wants strictly serializable transactions, which requires a linearizable ordering of transactions but only requires a partial ordering of operations (because not all transactions touch all keys). With some careful design in TAPIR, we are able to enforce a linearizable ordering of transactions with no ordering of operations."
5. What role do these concepts play in Cosmos DB?
Cosmos DB provides five well-defined operation consistency properties to the clients: strong, bounded staleness, session, consistent prefix, and eventual consistency. These consistency models were chosen because they are practical and useful, as signaled by their actual usage by customers for their production workloads.
And, of course, in order not to leave performance on the table, instead of imposing strong state-based consistency, Cosmos DB global distribution protocols employ several optimizations behind the curtain while still managing to provide the requested operation consistency levels to the clients.
Thus, state consistency still plays a major role in Cosmos DB, from the core layer developers'/designers' perspective. The core layer backend team uses TLA+ to identify and check the weakest state consistency invariants that support and imply the desired operation consistency levels. These weak invariants are efficient and do not require costly state synchronization.
As I mentioned in my previous post, I will post about the TLA+/PlusCal translation of the consistency levels provided by Cosmos DB here. I also hope to talk about some of the state consistency invariants and efficiency techniques employed when I start describing global distribution at the Cosmos DB core layer in my upcoming blog posts.
MAD questions
1. Is it possible to bridge the two approaches?
These two approaches can be complementary, like two sides of the same coin.
I think it is easier to go from the state consistency of the system to inferring and designing operation consistency properties.
How about the other way? While checking operation consistency requires analyzing an execution log, by restricting the domain to specific operations (such as read and write), it is possible to achieve application independence and treat the system as a blackbox. The Jepsen library achieves that for distributed systems testing. If the system is a blackbox, or was developed without invariants/state consistency, this operation consistency based testing approach can help in identifying problems. But it is still unclear how to infer/design state consistency properties/invariants that can help in fixing the problem.
The development of better tools for observability/auditability (such as Retroscope) can help in bridging this gap.
Another effort to help bridge the gap could be to identify/name and create an ontology of the invariants used in state consistency approaches. I don't know of much work in that direction except this one from VLDB15.
2. Do these two approaches converge at the eventual consistency position?
The paper states the following. "Operational eventual consistency is a variant of eventual consistency (a form of state consistency) defined using operation consistency. The requirement is that each write be eventually seen by all reads, and if clients stop executing writes then eventually every read returns the same latest value."
Eventual consistency is likely a natural convergence point for state and operation consistency. This reminds me of the "Conflict-free replicated data types" paper.
3. Are terms/definitions about consistency consistent yet?
Consistency is an important concept, so it emerged and developed in different domains (distributed systems, databases, and computer architecture) simultaneously. And of course different domains used different terminology and confusion arose.
I am not surprised. Even in the distributed systems community, on the restricted topic of distributed consensus, researchers used inconsistent terminology for many decades before consensus (pun intended) arose and the terminology converged and standardized.
Consistency definitions can be overwhelming because there are many parameters involved, even without considering transactions.
https://arxiv.org/pdf/1512.00168.pdf
Kyle Kingsbury recently provided a simplified clickable map of major consistency models (including transactional consistency models).
https://jepsen.io/consistency