Life Beyond Distributed Transactions: An Apostate's Opinion
Pat Helland is one of the veterans of the database community. He worked at Tandem Computers with Jim Gray. His tribute to Jim Gray, which gives a lot of insight into Jim Gray as a researcher, is worth reading again and again.
This 2007 position paper from Pat Helland is about extreme scalability in cloud systems, and is by its nature anti-transactional. Since Pat has been a strong advocate for transactions and global serializability for most of his career, the paper is aptly titled an apostate's opinion.
This paper is very relevant to the NoSQL movement. Pat introduces the "entity and activities" abstractions as building primitives for extremely scalable cloud systems. He also talks at length about the need to craft good workflow/business logic on top of these primitives.
Entity and activities abstractions
Entities are collections of named (keyed) data which may be atomically updated within the entity but never atomically updated across entities. An entity lives on a single machine at a time and the application can only manipulate one entity atomically. A consequence of almost-infinite scaling is that this programmatic abstraction must be exposed to the developer of business logic. Each entity has a unique ID, and entities represent disjoint sets of data.
Since you can't update the data across two entities in the same transaction, you need a mechanism to update the data in different transactions. The connection between the entities is via a message addressed to the other entity.
Activities comprise the collection of state within the entities used to manage messaging relationships with a single partner entity. Activities keep track of messages between entities. This can be used to keep entities eventually consistent, even when we are limited to doing a transaction on a single entity. (Messaging notifies the other entity about this activity, and the other entity may update its state.)
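To make the entity/activity split concrete, here is a minimal in-memory sketch in Python. It is not from Pat's paper: the class names (Entity, Activity, Message), the sequence-number scheme, and the account-balance example are all assumptions chosen for illustration. Each method stands in for one local transaction on a single entity, and the per-partner Activity record is what lets the receiver ignore a retried message.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """Per-partner messaging state kept inside an entity (hypothetical layout)."""
    next_seq_to_send: int = 0
    seen_seqs: set = field(default_factory=set)


@dataclass
class Message:
    sender: str
    receiver: str
    seq: int
    amount: int


@dataclass
class Entity:
    entity_id: str
    balance: int = 0
    activities: dict = field(default_factory=dict)  # partner id -> Activity
    outbox: list = field(default_factory=list)

    def _activity(self, partner_id):
        return self.activities.setdefault(partner_id, Activity())

    def send_credit(self, partner_id, amount):
        """One local transaction: debit self and record the outgoing message."""
        act = self._activity(partner_id)
        msg = Message(self.entity_id, partner_id, act.next_seq_to_send, amount)
        act.next_seq_to_send += 1
        self.balance -= amount
        self.outbox.append(msg)
        return msg

    def receive(self, msg):
        """One local transaction: apply the message unless it was already seen."""
        act = self._activity(msg.sender)
        if msg.seq in act.seen_seqs:
            return False  # duplicate delivery, state is left unchanged
        act.seen_seqs.add(msg.seq)
        self.balance += msg.amount
        return True


a = Entity("A", balance=100)
b = Entity("B", balance=0)
msg = a.send_credit("B", 30)
first = b.receive(msg)
retry = b.receive(msg)  # simulated at-least-once redelivery
print(a.balance, b.balance, first, retry)
```

No transaction ever spans both entities: A's debit and B's credit commit separately, and the two balances agree only after the message has been delivered.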
The key-value tuple concept widely employed in key-value stores in cloud computing systems is a good example of an entity. However, key-value tuples do not specify any explicit "activities". Note that if we can manage to make messages between entities idempotent, then we don't need to keep activities for entities; thus the entity+activities concept reduces to the key-value tuple concept.
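A small Python sketch of why idempotence removes the need for per-partner bookkeeping. The two message types and the key names below are hypothetical, picked only to contrast an idempotent update with a non-idempotent one under a duplicated delivery.

```python
def apply_put(store, key, value):
    """Idempotent: applying the same message twice leaves the same state."""
    store[key] = value


def apply_increment(store, key, delta):
    """Not idempotent: a duplicated delivery changes the result."""
    store[key] = store.get(key, 0) + delta


idempotent_store = {}
for _ in range(2):  # simulate a duplicated delivery
    apply_put(idempotent_store, "order:42:status", "paid")

counter_store = {}
for _ in range(2):  # the same duplication, now harmful
    apply_increment(counter_store, "order:42:paid_cents", 500)

print(idempotent_store)
print(counter_store)
```

The put-style message can be retried blindly against a plain key-value tuple, while the increment-style message needs some record of what has already been applied.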
In fact, several developers have independently invented different ad hoc ways of implementing activities on top of entities. What Pat is advocating is to explicitly recognize activities and develop a standard primitive for implementing them to avoid inconsistency bugs.
An example of an activity is found in Google's Percolator paper; Percolator replaced MapReduce for creating Google's PageRank index. Percolator provides distributed transaction middleware built on BigTable. Each row is an entity, as a transaction is atomic with respect to a row at any time. However, to build a distributed transaction, the system should remember the state of the transaction with respect to the other involved rows, i.e., "activities". This Percolator metadata is again encoded as a separate field in that row in BigTable. Percolator logs the state, for example primary and secondary locks, in these fields. (See Figure 5 for the full list.) I guess using coordination services such as ZooKeeper is also another way of implicitly implementing activities.
Workflow is for dealing with uncertainty at a distance
In a system which cannot count on atomic distributed transactions, the management of uncertainty must be implemented in the business logic. The uncertainty of the outcome is held in the business semantics rather than in the record lock. This is simply workflow. Think about the style of interactions common across businesses. Contracts between businesses include time commitments, cancellation clauses, reserved resources, and much more. The semantics of uncertainty is wrapped up in the behavior of the business functionality. While more complicated to implement than simply using atomic distributed transactions, it is how the real world works. Again, this is simply an argument for workflow, but it is fine-grained workflow with entities as the participants.
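As a rough illustration of holding uncertainty in business semantics rather than in a record lock, here is a Python sketch of a reservation with a time commitment and a cancellation path. The Inventory and Reservation classes, the state names, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are assumptions for illustration, not something the paper prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    HELD = "held"
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"


@dataclass
class Reservation:
    expires_at: float
    state: State = State.HELD


class Inventory:
    """A single entity; every method is one local transaction."""

    def __init__(self, stock):
        self.available = stock
        self.reservations = {}  # reservation id -> Reservation

    def reserve(self, res_id, now, hold_seconds):
        if res_id in self.reservations:  # retried request, same answer
            return self.reservations[res_id].state
        if self.available == 0:
            return None
        self.available -= 1  # reserved resource
        self.reservations[res_id] = Reservation(expires_at=now + hold_seconds)
        return State.HELD

    def cancel(self, res_id):
        res = self.reservations[res_id]
        if res.state is State.HELD:  # cancellation clause
            res.state = State.CANCELLED
            self.available += 1
        return res.state

    def confirm(self, res_id, now):
        res = self.reservations[res_id]
        if res.state is State.HELD and now > res.expires_at:
            return self.cancel(res_id)  # time commitment was missed
        if res.state is State.HELD:
            res.state = State.CONFIRMED
        return res.state


inv = Inventory(stock=1)
inv.reserve("r1", now=0.0, hold_seconds=60.0)
late = inv.confirm("r1", now=120.0)  # arrives after the hold expired
inv.reserve("r2", now=130.0, hold_seconds=60.0)
on_time = inv.confirm("r2", now=140.0)
print(late, on_time, inv.available)
```

While a reservation is HELD, the outcome is genuinely undecided, yet no lock is held across entities: the partner can confirm, cancel, or simply go silent, and the inventory entity resolves each case on its own.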
Concluding remarks
Systematic support for implementing the activities concept is still lacking today. It seems like this concept needs to be addressed more explicitly and more methodically to improve NoSQL systems.
Workflow is also prescribed as the way to deal with the lack of atomic distributed transactions. Workflow requires the developer to think hard and decide on the business logic for dealing with the decentralized nature of the process: time commitments, cancellation clauses, reserved resources, etc. But is there any support for developing/testing/verifying workflows?