Is This Consensus?

The specification for consensus is every bit follows. The kickoff 2 are security properties, the finally i a liveness property.
  • Agreement: No 2 node tin ship away commit unlike decisions.
  • Validity (Non-triviality): If all initial values are same, nodes must commit that value.
  • Termination: Nodes commit eventually.
Below I beak close whether ABD or chain replication solve consensus, in addition to whether it would hold upward possible to implement nation machine replication using them.

Does ABD solve consensus?



No, ABD does non solve consensus. I had written a summary of the ABD protocol inwards a 2012 post. And I had talked close why ABD is non consensus inwards a 2015 post. Below is a brusk recap of that followed yesteryear a give-and-take of whether ABD tin ship away nonetheless hold upward employed to solve the nation machine replication problem.

Consensus is prone to the FLP impossibility result, in addition to it may lose progress nether FLP conditions. In particular, for Paxos, if nosotros can't decide whether the incumbent leader has failed inwards an asynchronous environment, in addition to then nosotros may consider the dueling leader problem, which may travel on until failure detection in addition to consequently leader election stabilizes.

In contrast, ABD is non affected yesteryear the FLP result. This is because ABD is memoryless in addition to hedonistic. ABD is happy amongst unresolved, partial acceptances inwards the past. Heck, it volition completely overwrite a value that is accepted yesteryear all nodes if around other write comes amongst a higher timestamp.

In comparison, Paxos is obsessed amongst the past. For each consensus instance, Paxos clings onto the value a node has seen (with the highest ballot). In the kickoff in addition to instant stage of write inwards ABD at that spot is no rejection/restart of the operation. In Paxos, both inwards the kickoff stage in addition to instant phase, a leader/candidate may have a rejection in addition to goes dorsum to retry stage 1.

Since ABD is memoryless in addition to hedonistic, nosotros tin ship away non implement replicated nation machines (RSMs) amongst ABDs. But let's force on this a bit. So far, nosotros were treating timestamps every bit ballots inside the same instance. What if nosotros process the timestamps inwards ABD every bit slots inwards multiple instances of accepting values? By using "timestamp = counter + leaderID", nosotros tin ship away implement total club on slots in addition to remove hold linearizability in addition to rigid consistent reads amongst multiple clients.

Can nosotros in addition to then implement RSM over ABD player nodes using these slot numbers? No! Because ABD skips yesteryear around slot numbers every bit unresolved since it is ever eager to motility ahead inwards a hedonistic manner. ABD goes amongst the highest timestamp of bulk read, in addition to it is non possible to acquire dorsum to before slots in addition to resolve them inwards illustration of ambiguity. In ABD, the nodes remove hold independently, in addition to at that spot is no commit/resolution phase, in addition to the logs inwards the nodes diverge. We can't brand a resolution close a slot's final result fifty-fifty inwards God-view mode, where nosotros tin ship away await at the values inwards a bulk of the nodes (not all nodes, because upward to a minority of nodes may hold upward unavailable per fault-model). Let's say nosotros consider a value at node A, in addition to no value at the other nodes constituting bulk for this slot. The ambiguity that remains is every bit follows. Maybe node A was piece of job of the bulk that had this value in addition to the other nodes are non reachable now. Or mayhap A was the solely node that had received this value. We cannot decide the difference. If nosotros acquire amongst the kickoff possibility, it is possible that, around other God-view to a unlike bulk may divulge that indeed the contrary was the case.

In contrast,  Paxos has a commit stage that marks that the value for that slot is resolved in addition to finalized. Any commit (even read from i node's committed value) is a valid commit because the leader has observed that bulk has accepted/stored it. So it is guaranteed that other nodes volition also know (or learn) that commit. So Paxos relearns in addition to does non travel out whatever gap inwards the slot numbers piece committing, because those slots numbers acquire executed every bit they are committed.

As the closing word on ABD, nosotros should Federal Reserve annotation that ABD is nonetheless useful for storage in addition to linearizability, it solves the atomic storage problem. Here comes the departure betwixt stateless operations (register operations set in addition to get) versus stateful operations (commands inwards full general that mutate state, which yesteryear Definition depends on the nation they are invoked/executed). For storage, nosotros don't remove stateful operations. Using ABD nosotros accomplish linearizability, in addition to tin ship away serve strong-consistency reads via using ABD fifty-fifty amongst multiple clients.

Does chain replication solve consensus?



I had written upward a description of the chain replication protocol inwards a 2011 post. Chain replication employs a Paxos-backed configuration box that maintains the configuration/topology of the chain nodes, in addition to the chain nodes only replicate the values inwards a streamlined fashion. The beauty hither is that Paxos is kept out of the information path, thus it is non involved amongst whatever replication request. Paxos is employed inwards the command path, in addition to is consulted solely when a error happens.

Does chain replication solve consensus? I haven't seen this interrogation addressed inwards previous piece of job neither inwards the master chain replication work, nor inwards whatever of the followup work. The respond is, no, chain replication does non solve the consensus problem! This is a trickier signal to appreciate than the ABD case.

Chain replication does non violate agreement/safety property: for a given instance, no 2 nodes volition remove hold unlike commits because they re-create the caput of the chain. But chain replication volition violate progress for the consensus illustration inwards that slot if the chain topology changes. Let's say solely the caput in addition to around other node committed a value in addition to they died or acquire disconnected, in addition to every bit a final result the chain topology is reconfigured yesteryear the config-box. No other node tin ship away commit around other value for that illustration because the epoch-number has changed amongst the configuration determination from the config-box. This is both adept intelligence in addition to bad news. Safety is non violated but nosotros lost progress/termination for that slot: the remaining nodes are non able to resurrect in addition to resolve this detail consensus illustration to termination. So although chain replication solves consensus inwards the absence of failures, inwards the presence of failures it deserts the consensus illustration without culminating it to resolution in addition to moves on. After the config-box appoints a novel chain topology, the progress in addition to security are both satisfied for the side yesteryear side consensus illustration (with the incremented epoch number).

To recap, chain replication gets things resolved/finalized in addition to keeps the same log inwards the absence of faults, but inwards the presence of faults, the logs inwards participants may diverge. Consider a node that accepts a value, in addition to and then due to failure in addition to chain reconfiguration it has been pushed out of the chain. How does that node larn whether what it has accepted before it crushed is skipped over or finalized? There is no commit inwards chain replication (of shape the ack-backpropagation inwards the CRAQ optimization may piece of job every bit commit)... Even amongst the patently chain replication, nosotros tin ship away fighting that, that node is at i time an wrong node every bit marked yesteryear the config-box, thus nosotros don't help close its consistency. And if that node joins the chain again, it volition bring together every bit tail, larn the same log every bit the other nodes. From that perspective, in addition to yesteryear seeding off of the Paxos-backed config-box, nosotros tin ship away fighting that RSM tin ship away hold upward implemented over chain replication.

Acknowledgment

The agency to empathise these things is yesteryear sparring amongst colleagues. I am grateful for Ailidani Ailijiang in addition to Aleksey Charapko for the discussion. It is non slowly to argue close distributed systems --but is surely rewarding later the fact. It took us 2 or 3 animated give-and-take sessions over java to acquire to the bottom of this.

0 Response to "Is This Consensus?"

Post a Comment

Iklan Atas Artikel

Iklan Tengah Artikel 1

Iklan Tengah Artikel 2

Iklan Bawah Artikel