Crash-Only Software, HotOS'03

(This is the summary referred to in the last section of the post on Ousterhout's "The Role of Distributed State" paper.)

Motivation

Since crashes are unavoidable, software must be at least as well prepared for a crash as it is for a clean shutdown. But then, in the spirit of Occam's Razor, if software is crash-safe, why support additional, non-crash mechanisms for shutting down? A frequent reason is the desire for higher performance. For example, to avoid slow synchronous disk writes, many UNIX file systems cache metadata updates in memory. As a result, when a UNIX workstation crashes, the file system reaches an inconsistent state that takes a lengthy fsck to repair, an inconvenience that could have been avoided by shutting down cleanly. This captures the design tradeoff that improves steady-state performance at the expense of shutdown and recovery performance. But if the cost of such performance enhancements is dependability, perhaps it's time to reevaluate our design strategy.

The major benefit of a crash-only design is the following: A crash-only system makes it affordable to transform every detected failure into a component-level crash; this leads to a simple fault model, and components only need to know how to recover from one type of failure. If we state invariants about the system's failure behavior and make such behavior predictable, we are effectively coercing reality into a small universe governed by well-understood laws. If we don't use a crash-only system and instead allow arbitrary fault and recovery behavior, the resulting set of reachable states becomes very large and complex.

Requirements of Crash-Only Software

To make components crash-only, the crash-only approach requires that all important non-volatile state be kept in dedicated crash-only state stores, leaving applications with only program logic. Specialized state stores (e.g., databases, file system appliances, distributed data structures, non-transactional hashtables, session state stores, etc.) are much better suited to manage state than code written by developers with minimal training in systems programming. Applications become stateless clients of the state stores, which allows them to have simpler and faster recovery routines. (This requirement will unavoidably hurt system performance.)
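As a minimal sketch of this split, the Python below keeps all state in a separate store and leaves the application component with only logic. The names `StateStore` and `CartService` are hypothetical illustrations, not from the paper, and the in-memory dictionary merely stands in for a real crash-only store such as a database.

```python
class StateStore:
    """Stand-in for a dedicated state store (hypothetical; a real one is durable)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class CartService:
    """Application logic only: it holds a reference to the store and nothing else."""

    def __init__(self, store):
        self.store = store

    def add_item(self, user, item):
        # Read, modify, and write back through the store on every call,
        # so no state is lost if this object is crashed and recreated.
        cart = self.store.get(user, [])
        self.store.put(user, cart + [item])
        return len(cart) + 1
```

Because `CartService` carries no state of its own, "recovery" in this sketch is just constructing a new instance against the same store.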

To make a system of interconnected components crash-only, it must be designed so that components can tolerate the crashes and temporary unavailability of their peers. Thus, the approach prescribes strong modularity with relatively impermeable component boundaries, timeout-based communication and lease-based resource allocation, and self-describing requests that carry a time-to-live and information on whether they are idempotent. Many Internet systems today have some subset of these properties, as they are built from many heterogeneous components, accessed over standard request-reply protocols such as HTTP, and serve workloads that consist of large numbers of relatively short tasks that frame state updates.
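To make the idea of a self-describing request concrete, here is a small Python sketch. The field names, the `Action` enum, and the `on_timeout` policy are assumptions made for illustration; the paper prescribes the properties (a time-to-live and an idempotency flag), not this particular encoding.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRY = "retry"            # safe to resubmit as-is
    COMPENSATE = "compensate"  # not idempotent: roll back or compensate instead
    DROP = "drop"              # time-to-live has expired


@dataclass(frozen=True)
class Request:
    """A self-describing request (field names are hypothetical)."""
    payload: str
    issued_at: float   # seconds, on the caller's clock
    ttl: float         # time-to-live in seconds
    idempotent: bool   # True if re-executing the request is harmless

    def expired(self, now: float) -> bool:
        return now - self.issued_at >= self.ttl


def on_timeout(req: Request, now: float) -> Action:
    """Decide what to do when a peer did not answer within the timeout."""
    if req.expired(now):
        return Action.DROP
    if req.idempotent:
        return Action.RETRY
    return Action.COMPENSATE
```

Because the request itself says how long it is valid and whether it can be replayed, any component (or proxy) that sees a timeout can make this decision without knowing what the request means.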

What can go wrong here?

The paper mentions the following potential problem: The dynamics of loosely coupled systems can sometimes be surprising. For example, resubmitting requests to a component that is recovering can overload it and make it fail again; so the RetryAfter exceptions provide an estimated time-to-recover. To prevent reboot cycles and other unstable conditions during recovery, it is possible to quiesce the system when a set of components is being crash-rebooted. This can be done at the communication/RPC layer, or for the system as a whole. In their prototype, the authors use a stall proxy in front of the web tier to keep new requests from entering the system during the recovery process.
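A rough Python sketch of how a caller might honor such a hint follows. Only the name RetryAfter comes from the paper; the exception's shape, the `call_with_retry` helper, and the injected `sleep` function are assumptions for illustration.

```python
class RetryAfter(Exception):
    """Raised by a recovering component; carries an estimated time-to-recover."""

    def __init__(self, seconds: float):
        super().__init__(f"retry after {seconds}s")
        self.seconds = seconds


def call_with_retry(call, sleep, max_attempts: int = 3):
    """Invoke call(); on RetryAfter, wait the suggested time and try again.

    sleep is passed in (e.g. time.sleep) so the waiting policy can be
    replaced or tested without real delays.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryAfter as hint:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(hint.seconds)
```

Waiting for the advertised recovery time, rather than resubmitting immediately, is what keeps the retries from piling onto a component that is still coming back up.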

Taking things further, I can think of scarier scenarios. A component restart may take some time and, hence, may trigger restarts or problems at other components that depend on this component. As a result, one restart may trigger a system-wide restart storm. The silver lining in this cloud is that, hopefully, by designing the system to be crash-only, you will test for and notice these problems before deployment, as these cases will be exercised often.

Another complaint is that the crash-only approach puts some burden on the developers. It requires that the components be designed to be very loosely coupled. It requires the developer to write code to keep track of lost requests, and to retry the lost requests if they are idempotent, or else either perform a rollback recovery, apply a compensating operation, or somehow tolerate the inconsistency. While this code is for component-level recovery (and thankfully not system-wide recovery), it may still grow too complex for the developer to handle. Again, this may be unavoidable; fault-tolerance comes with a cost.

My other concern about the crash-only approach is a continuous cyclic corruption of components (this case is different from "the recorruption of a component during recovery" mentioned above). In this scenario, faults in component A leak and corrupt component B, and after A restarts and corrects itself, this time faults leak from component B to re-contaminate/corrupt A. Rinse and repeat the above loop, and we have a vicious cycle of ongoing corruption/contamination. My advisor Anish Arora had a solution to this problem for composition of self-stabilizing components. The solution used dependency graphs on the corruption and correction relations among components and accordingly prescribed some blocking wrappers to freeze contamination and break cycles. I found that the authors of the crash-only approach have a simpler solution to this, which they provide in their recursive-restartability paper. Their idea is to first attempt recovery of a small subset of components (say only one component), and if the restart proves ineffective, the subsequent attempts recover progressively larger subsets. In other words, this technique chases the fault through successive boundaries, and can thus break the corruption cycle. It is not very efficient, but it is simple and it works eventually.
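The escalation idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the algorithm from the recursive-restartability paper: the `path` of nested groups, the `restart` callback, and the `healthy` check are all hypothetical names supplied by the caller.

```python
def recursive_restart(path, restart, healthy):
    """Restart progressively larger subsets until the failure goes away.

    path: component groups ordered from the smallest suspect subset to the
          whole system, e.g. [{"A"}, {"A", "B"}, {"A", "B", "C"}].
    restart(group): crash-reboots every component in the group.
    healthy(): returns True once the failure is no longer observed.
    Returns the group whose restart cured the failure, or None if even the
    largest group did not help.
    """
    for group in path:
        restart(group)
        if healthy():
            return group
    return None
```

In the A/B contamination cycle described above, restarting A alone is undone by B, but the next step restarts A and B together, which removes both sources of corruption at once.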

The paper is realistic about the limitations of the crash-only approach, and does not paint an all-rosy picture. Here are some quotes from the paper on this:

Building crash-only systems is not easy; the key to widespread adoption of our approach will require employing the right architectural models and having the right tools.
We are focusing initially on applications whose workloads can be characterized as relatively short-running tasks that frame state updates. Substantially all Internet services fit this description, in part because the nature of HTTP has forced designers into this mold. We expect there are many applications outside this domain that could not easily be cast this way, and for which deriving a crash-only design would be impractical or infeasible.
We expect throughput to suffer in crash-only systems, but this concern is secondary to the high availability and predictability we expect in exchange.

Comparison to self-stabilization

I think the crash-only approach is a special (more blackbox and more scalable) instance of self-stabilizing system design.

The crash-only approach uses the component crash abstraction to abstract away different fault types and different arrival sequences of these faults. Similarly, the self-stabilization approach uses the arbitrary state corruption abstraction to avoid the need to characterize the effects of different types of faults. Of course, the difference between the stabilization approach and the crash-only approach is that stabilization requires access to the code and requires figuring out the good states of the system to converge to. In other words, stabilization is a chisel and crash-only is a large hammer. But for scalability of adding fault-tolerance to large systems, you may be better off using a hammer than a chisel.

The state-store approach and keeping application logic stateless also connect back to the self-stabilization approach. Building a stateless system is a trivial way to make a system self-stabilizing. If you don't have any state to corrupt, the system is trivially self-stabilizing. (Similarly, if the system is built to be soft-state, so that corrupted state expires with time and new/correct information is written in its place, then the system is again easily shown to be self-stabilizing.)
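The soft-state argument can be illustrated with a tiny expiring table in Python. `SoftStateStore` and its `lifetime` parameter are hypothetical names for this sketch, and time is passed in explicitly so the behavior is easy to follow.

```python
class SoftStateStore:
    """Entries expire unless refreshed (a hypothetical minimal soft-state table)."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime
        self._entries = {}  # key -> (value, time written)

    def put(self, key, value, now: float) -> None:
        self._entries[key] = (value, now)

    def get(self, key, now: float):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, written = entry
        if now - written >= self.lifetime:
            del self._entries[key]  # stale (possibly corrupted) state ages out
            return None
        return value
```

Whatever value ends up in the table, corrupted or not, it survives for at most `lifetime` seconds unless something rewrites it, so the table converges to whatever the writers are currently refreshing.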

Final remarks

The paper includes the following sentence in the conclusions section, which I think is a very good summary/evaluation of the approach: "Writing crash-only components may be harder, but their simple failure behavior can make the assembly of such components into large systems easier."

Yes, the crash-only approach provides some compositionality of fault-tolerance, because it uses restarts to transform arbitrary faults into crashes, and each crash-only component is written anticipating that it will interact with other crash-only components, so it can tolerate crashes of other components. However, things are not always easy in practice and there are caveats, as we mentioned above. Some faults can leak through the crash-only abstraction and contaminate other components. Also, there could be hidden/emergent interactions/couplings between components that the developer needs to tune.