SOSP19 Verifying Concurrent, Crash-Safe Systems with Perennial

This paper is by Tej Chajed (MIT CSAIL), Joseph Tassarotti (MIT CSAIL), Frans Kaashoek (MIT CSAIL), and Nickolai Zeldovich (MIT CSAIL).

Replicated disk systems, such as file systems, databases, and key-value stores, need both concurrency (to provide high performance) and crash safety (to keep your data safe). The replicated disk library is subtle, but the paper shows how to systematically reason about all possible executions using verification. (This work considers verification of a single-computer storage system with multiple disks, not a distributed storage system.)

Existing verification frameworks support either concurrency (CertiKOS [OSDI ’16], CSPEC [OSDI ’18], AtomFS [SOSP ’19]) or crash safety (FSCQ [SOSP ’15], Yggdrasil [OSDI ’16], DFSCQ [SOSP ’17]), but not both.

Combining verified crash safety and concurrency is challenging because:
  • Crash and recovery can interrupt a critical section,
  • Crash can wipe in-memory state, and
  • Recovery logically completes crashed threads' operations.

Perennial introduces three techniques to address these three challenges:
  • leases to address crash and recovery interrupting a critical section,
  • memory versioning to address crash wiping in-memory state, and
  • recovery helping to address problems due to interference from recovery actions.

The presentation deferred to the paper for the first two techniques and explained only the recovery helping technique.

To show that the implementation satisfies the high-level specification, a forward simulation is shown under an abstraction relation. The abstraction relation maps the concrete (implementation) state to the high-level abstract specification state. Perennial adopted the abstraction relation: "if not locked (due to an operation in progress), then the abstract state matches the concrete state in both disks".

The problem is that crashing breaks the abstraction relation. To fix this problem, Perennial separates the crash invariant (which refers to interrupted spec operations) from the abstraction invariant. The recovery proof relies on the crash invariant to restore the abstraction invariant.


The crash invariant says "if the disks disagree, some thread was writing the value on the first disk". The recovery helping technique then lets recovery commit writes from before the crash. The recovery proof shows that the code restores the abstraction relation by completing all interrupted writes. As a result, users get correct behavior and atomicity.
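To make the two invariants concrete, here is a minimal Go sketch of a two-disk replicated store. It is not Perennial's or Goose's code: the names `Disk`, `ReplicatedDisk`, `Agree`, `Recover`, and the `crashAfterFirst` flag are hypothetical, and in-memory maps stand in for real disks. `Agree` plays the role of the abstraction relation in the unlocked case, and `Recover` completes an interrupted write by copying disk 1 over disk 2.

```go
package main

import "sync"

// Disk is a hypothetical in-memory stand-in for one physical disk.
type Disk struct {
	blocks map[int]string
}

// ReplicatedDisk mirrors every block onto two disks.
// The mutex is in-memory state, so a real crash would discard it.
type ReplicatedDisk struct {
	mu     sync.Mutex
	d1, d2 Disk
}

func NewReplicatedDisk() *ReplicatedDisk {
	return &ReplicatedDisk{
		d1: Disk{blocks: map[int]string{}},
		d2: Disk{blocks: map[int]string{}},
	}
}

// Write updates disk 1 first, then disk 2. Setting crashAfterFirst
// simulates a crash in the middle of the critical section.
func (r *ReplicatedDisk) Write(addr int, v string, crashAfterFirst bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.d1.blocks[addr] = v
	if crashAfterFirst {
		return // simulated crash: disk 2 never sees the write
	}
	r.d2.blocks[addr] = v
}

// Read returns the value stored on disk 1.
func (r *ReplicatedDisk) Read(addr int) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.d1.blocks[addr]
}

// Agree reports whether both disks hold the same value at every address,
// which is what the abstraction relation demands when nothing is locked.
func (r *ReplicatedDisk) Agree() bool {
	if len(r.d1.blocks) != len(r.d2.blocks) {
		return false
	}
	for a, v := range r.d1.blocks {
		if w, ok := r.d2.blocks[a]; !ok || w != v {
			return false
		}
	}
	return true
}

// Recover relies on the crash invariant (any disagreement means a write
// reached disk 1 only) and completes those writes on disk 2.
func (r *ReplicatedDisk) Recover() {
	r.mu.Lock()
	defer r.mu.Unlock()
	for a, v := range r.d1.blocks {
		r.d2.blocks[a] = v
	}
}
```

Calling `Write(0, "b", true)` leaves the disks disagreeing, so `Agree` is false. After `Recover`, the disks agree again and `Read(0)` returns the interrupted write's value, as if that write had completed atomically.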

The Perennial proof framework is written in 9K lines of Coq and provides crash reasoning: leases, memory versioning, and recovery helping. Perennial is built on top of the Iris concurrency framework (for concurrency reasoning), which is itself built on top of Coq. (Iris: R. Krebbers, R. Jung, A. Bizjak, J.-H. Jourdan, D. Dreyer, and L. Birkedal. The essence of higher-order concurrent separation logic. In Proceedings of the 26th European Symposium on Programming Languages and Systems, pages 696–723, Uppsala, Sweden, Apr. 2017.)

The authors have developed Goose for reasoning about Go implementations, but they also deferred this to the paper. The developer writes Go code, and the Goose translator (written in 2K lines of Go code) translates this into a Perennial model, where the proof is machine-checked with Coq.

To evaluate the Perennial framework, they verified a mail server written in Go. They argue that, compared to a verification in CSPEC [OSDI ’18] (their earlier verification framework), the verification in Perennial takes less effort and needs fewer lines of proof.


The software is available at https://chajed.io/perennial.

MAD questions


1. Is this an example of a convergence refinement relation?
In 2001, I was thinking about fault-tolerance preserving refinements as a graduate student working on graybox design of self-stabilization. The question was this: if we design fault-tolerance at the abstract level, what guarantee do we have that, after the abstract code is compiled and implemented in the concrete, the fault-tolerance still holds?

It is easy to see that fault-tolerance would be preserved by an "everywhere refinement" that preserves the abstraction relation (between concrete and abstract) at every state, including the states outside the invariant that are not reachable in the absence of faults. But the problem is that, outside the invariant, the abstraction relation may not hold because recovery actions differ from normal actions. That is pretty much the dilemma the Perennial work faced in verifying the recovery of replicated disks above.

OK, I said, let's relax the everywhere refinement to an "everywhere eventual refinement", and that would work for preserving fault-tolerance. Yes, it works, but it is not easy to prove that the concrete is an everywhere eventual refinement of the abstract, because there is a lot of freedom in this type of refinement and not much structure to leverage. The proof becomes as hard as proving fault-tolerance of the concrete from scratch. So what I ended up proposing was a "convergence refinement", where the actions of the concrete provide a compacted version of the actions of the abstract outside the invariant. In other words, the forward simulation outside the invariant would be skipping states in the concrete. Perennial, faced with the same dilemma, chose to use a different abstraction relation. The convergence refinement idea, in contrast, is to keep the same abstraction relation but allow the concrete to contract/skip steps in the computations outside the invariant states. I wonder if this could be applicable to the Perennial problem.

My reasoning for compacting steps in the refinement outside the invariant was that it is safer than expanding the computation: if you show recovery in the abstract, then by skipping steps (and not adding new ones) the concrete is also guaranteed to preserve that recovery.
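As a rough illustration of what "compacting" means for a single finite recovery run, the Go sketch below checks that a concrete run starts and ends in the same states as an abstract run and visits only states of the abstract run, in the same order. This is a simplification made for this post, not the formal definition from the convergence refinement paper; `IsCompaction` and the state names are hypothetical.

```go
package main

// IsCompaction reports whether the concrete run is a compacted version of
// the abstract run: both start and end in the same state, and the concrete
// states form an in-order subsequence of the abstract states, so steps may
// be skipped but never added.
func IsCompaction(abstract, concrete []string) bool {
	if len(abstract) == 0 || len(concrete) == 0 {
		return false
	}
	if abstract[0] != concrete[0] {
		return false
	}
	if abstract[len(abstract)-1] != concrete[len(concrete)-1] {
		return false
	}
	// Greedy subsequence match: advance through the concrete run each
	// time its next state shows up in the abstract run.
	i := 0
	for _, s := range abstract {
		if i < len(concrete) && concrete[i] == s {
			i++
		}
	}
	return i == len(concrete)
}
```

For an abstract recovery run `bad3, bad2, bad1, ok`, the concrete run `bad3, bad1, ok` passes the check because it only skips `bad2`. A concrete run that visits a state the abstract run never visits, or visits states in a different order, is rejected.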

Here is the abstract of my 2002 paper on convergence refinement. I just checked, and this paper only got 19 citations in 19 years. It did not age well after getting a best paper award at ICDCS'02. In comparison, some of the papers we wrote quickly and published as short papers or workshop papers got 150-900 citations in less than 10 years. Citations are funny business.
Refinement tools such as compilers do not necessarily preserve fault-tolerance. That is, given a fault-tolerant program in a high-level language as input, the output of a compiler in a lower-level language will not necessarily be fault-tolerant. In this paper, we identify a type of refinement, namely "convergence refinement", that preserves the fault-tolerance property of stabilization. We illustrate the use of convergence refinement by presenting the first formal design of Dijkstra’s little-understood 3-state stabilizing token-ring system. Our designs begin with simple, abstract token-ring systems that are not stabilizing, and then add an abstract "wrapper" to the systems so as to achieve stabilization. The system and the wrapper are then refined to obtain a concrete token-ring system, while preserving stabilization. In fact, the two are refined independently, which demonstrates that convergence refinement is amenable for "graybox" design of stabilizing implementations, i.e., design of system stabilization based solely on system specification and without knowledge of system implementation details.
