Modeling Streamlet Inwards Tla+
I had provided an overview of the Streamlet blockchain protocol inward my lastly post, delight read that before reading this post on modeling Streamlet inward TLA+. Also if you lot desire to acquire an thought nearly TLA+, this may live a adept post to read.
Here I model the crash fault-tolerant version of Streamlet provided inward Appendix A of the paper. Modeling Byzantine nodes is rattling hard: How do you lot copy arbitrary demeanour of the nodes? How do you lot together with then promise to model cheque it? So I stick amongst the crash fault-tolerant version which is rattling similar to the Byzantine fault-tolerant version of the protocol. This version solely considers crash faults together with inward plough it tin tolerate upward to a minority of node failures.
The argue I started modeling was because I was suspicious how at that spot was no postulate for using the epoch unwrap cognition inward the protocol or the proof of security inward an asynchronous environment. The protocol employs solely the longest chain rule. Longest chain does non necessarily hateful the chain amongst the highest epoch. It cares nearly how many notarized blocks are at that spot inward the chain. As far every bit the epoch numbers is concerned, the solely affair the protocol cares nearly is to run across iii consecutive epoch blocks for finalization. The protocol doesn't say nearly monotonically increasing requirement for epoch numbers inward an asynchronous surroundings (again except for the finality dominion which requires 3 consecutive epoch blocks).
OK, let's acquire to it. For this modeling exercise, I captured my thought procedure spell developing the model, together with was able to banker's complaint downwardly about of my mistakes every bit well. So I volition introduce this every bit a dev-log style. I volition add together in-line comments inward parenthesis nearly how about of my initial decisions evolved/changed over time.
I was rusty amongst TLA+ every bit I hadn't modeled anything inward TLA+ for perhaps half dozen months. But I was able to acquire this working inward 3 hours or so.
Gathering my thoughts before modeling
I postulate a chain, together with I tin model that every bit a tuple. (I was start thinking the chain volition grow past times appending novel value/block to its tail. But together with then when modeling the FindMaxNotarizedChain I postulate to sometimes chop the latest value/block if it is non notarized. So I reversed the administration of the chain making it grow past times adding newest value/block to the head, together with chopped the latest block past times invoking Tail component subdivision on the tuple.)
For communication, the simplest affair is to travel a shared whiteboard Message modeled every bit a set. Processes broadcast messages past times adding their message to the Message Set, i.e., Msgs. To read messages, processes tin travel filtering on the Message Set to think all matching messages. The downwardly side is that all nodes volition run across the same message set, it volition live similar atomic delivery to all processes when a Message is added to the Message Set. In other words, if procedure j sees message m, whatsoever procedure k accessing the Message Set after j is guaranteed to run across thousand every bit well. But this is OK for the start model, I tin relax this supposition subsequently on. This volition also help decease along the model checking fourth dimension short.
Should I take away hold 2 whiteboards, ane for proposals, together with ane for votes? I tin acquire away amongst ane whiteboard. The vote messages tin also live amongst the same message format every bit the suggest messages. A reasonable format is [chain, e, sender-id].
I simply postulate ane type of process, because at that spot is no distinguished leader process. There volition live 2 actions to check.
The suggest activeness would live something similar this.
if epoch % due north = myid then
find bring upward chain to live the longest notarized chain seen thus far past times searching the whiteboard amongst a macro.
Add block to the chain, and
bcast <chain, e, sender-id>
The bcast would live done simply past times adding this message to the Msgs laid for anyone to see.
The vote activeness would live something similar this.
If len(chain+e)> len(any notarized chain)
then bcast yes for <chain, e, id>.
Is at that spot an gild to the votes? Not necessarily. If you lot detect a proposal for an epoch you lot pick, you lot tin shipping a vote for it. You tin fifty-fifty shipping duplicate vote, the Msgs Set volition ignore duplicate votes, because it is a set.
What should the epoch progression dominion be? For security verification, I am checking amongst together with asynchronous environment. The protocol doesn't fifty-fifty take away hold lock-step rounds requirement. A node does non hold back for hearing from other nodes to growth its epoch together with propose. The solely component subdivision where about constraint is imposed is when proposing, you lot take away hold to take away your bring upward every bit notarized chain, together with that doesn't impose constraint on how epochs are chosen. What nearly this?
While (TRUE)
with e \in Epochs
if e%n = self, Propose(e)
else if \E propose for e inward Msgs together with then Vote(e)
So a node picks whatsoever random e from Epochs, this tin brand us saltation also far inward time to come but it is ok, nosotros desire to model cheque this amongst whatsoever gild of epochs inward a completely asynchronous environment. This is a bold means to evidence the protocol, isn't it?
Delving a layer deeper
Finding the notarized chain volition require writing about macros that filter messages from the Message Set, i.e. Msgs. I suspect the difficult component subdivision of the protocol is finding the longest notarized chain to attach to. And spell nosotros are talking nearly chains inward Msgs, inward the initial soil Msgs comprise the beginning block for the nodes to create on.
What are the variables maintained at each process? I postulate to take away hold nChain: the maximum notarized chain length a procedure has seen. I gauge that is it, the procedure does non postulate to hold whatsoever other state. It may fifty-fifty live possible to derive this from Msgs together with travel it every bit such without assigning it to a variable. But assigning to variable nChain would live adept for me to take away hold something to scout for inward the debug mode, otherwise it would live rattling difficult to follow what is going on simply past times ogling Msgs.
FindMaxChain(e) is tricky to write. Then for finding max notarized chain, I postulate to also cheque for bulk senders endorsing the chain.
FindMaxChain== CHOOSE z \in {m.chain: thousand \in S}:
(\E x \in {m.chain: thousand \in S}: Len(x)>Len(z));
That was non bad, nut FindMaxNotarizedChain volition live hard.
macro FindMaxNotarizedChain(e) {
check if the MaxChain is notarized;
else maxChain= Tail(FindMaxChain);
}
This checks if the MaxChain is notarized, if non it is because the lastly block is non notarized; this is because the previous blocks were guaranteed to live notarized before proposing the lastly block. If that is the instance nosotros chop of the latest block, together with render the remainder of the chain. But this is genuinely incomplete. Consider this. There may be two dissimilar notarized chains c1 together with c2 both amongst length=l. Furthermore, our FindMaxChain returns c1, whose latest block is non notarized, thus nosotros chop off the caput together with render it. But unknown to travel the chain c2 amongst length l was notarized completely including the latest block.
We postulate to human face at all chains amongst Len(FindMaxChain) ane past times one, together with iterate on them. Jeez... But for directly I tin simply human face at ane of them, together with if the latest block is nonetheless non notarized, chop it together with solely render the Tail of the chain. This is non the right, consummate solution. I don't similar this much, but this volition live a liveness occupation at worst, together with non a security problem, thus I tin alive amongst this for my start model.
How tin I model the finality check?
if \E chain amongst 3 consecutive ending inward Msgs, together with \E chain' amongst 3 consecutive ending inward Msgs, together with chain' is shorter than chain,
then chain'-chopped-tail is a prefix of chain-chopped-tail.
The model
Here is the total start draft model. https://github.com/muratdem/PlusCal-examples/blob/master/Streamlet/str0.tla
I won't demo them here, but if you lot become to my TLA+ file, you lot volition notice at the goal of the model at that spot are a lot of Bait/fake Invariants I came upward with. These are simply thus I tin acquire TLA to spit out an execution trace, thus I tin cheque whether the model is doing anything at all, together with if thus doing things according to my expectations. Thanks to them I fixed many bugs similar waiting for a suggest message for an epoch before voting on it, insisting for the same epoch unwrap for the cardinality cheque inward FindMaxNotarizedChain, together with getting (Len(ExtractMsg(e).chain) = Len(nChain)+1) correct inward Vote macro.
Here is the Consistency belongings nosotros were interested inward checking. It says that if at that spot exists 2 processes i together with j, each ane amongst a chain that ends inward iii consecutive epoch blocks, together with then the shorter chain (assume it is nchain[i] WLOG) volition live a prefix of the longer chain. In other words, at that spot is finality upto the the lastly block of the chain.
Consistency == \A i,j \in Nodes: ThreeCons(nChain[i]) /\ ThreeCons(nChain[j]) /\ Len(nChain[i])=< Len(nChain[j])
=> \A k \in 1..Len(nChain[i]): nChain[i][k]=nChain[j][Len(nChain[j])-Len(nChain[i])+k]
The verdict: Yes, consistency works! Even though nosotros do non cheque for monotonicity of epochs inward voting or proposing, consistency (i.e., safety) nonetheless works. Here is a draw that satisfies consistency together with violates monotonicity of epochs due to asynchronous execution. I challenged the TLA+ to plough over me this counterexample using the BaitMonotonicity invariant. (Recall that nosotros concur the chain inward opposite order. So amongst the below, I am challenging the model checker to plough over me a chain that violates the monotonicity status inward a notarized chain. I highlighted the non-monotonic chain)
\A i \in Nodes: ThreeCons(nChain[i]) => \A k \in 2..Len(nChain[i]): nChain[i][k]< nChain[i][k-1]
\A i \in Nodes: ThreeCons(nChain[i]) => \A k \in 2..Len(nChain[i]): nChain[i][k]< nChain[i][k-1]
I don't know if you lot noticed but my status is genuinely a chip stronger than what is inward the theorem. I am taking the lastly block also into account. This stronger version nonetheless checks out, I think because all nodes run across all sent messages atomically/unequivocally thank you lot to the shared whiteboard Msgs set. As I wrote earlier, inward this model, if procedure j sees message m, whatsoever procedure k accessing the Message Set after j is guaranteed to run across thousand every bit well. In a time to come model, I volition add together per procedure inboxes to address this, together with cheque to run across how this stronger variant tin live violated there.
In the adjacent model I tin also add together monotonicity of epochs status past times requiring that if a node voted for epoch e block, it shouldn't vote for whatsoever proposal amongst epoch less than e. This is a local status to cheque at each process, together with it doesn't rely on synchrony assumptions anyway. So it is reasonable to add together this inward the adjacent model, together with explore how it tin makes the model simpler (I am hoping this volition solve the incompleteness occupation of FindMaxNotarizedChain) together with to a greater extent than efficient to model check.
0 Response to "Modeling Streamlet Inwards Tla+"
Post a Comment