Distributed is not necessarily more scalable than centralized

Centralized is not necessarily unscalable!

Many people automatically associate centralized with unscalable, and distributed with scalable. And this is getting ridiculous.

In the Spring semester, in my seminar class, a PhD student was pitching me a project for distributed storage: syncing from phone to work/home computers and other phones. The pitch started with the sentence "Dropbox is unscalable, because it is centralized". I was flabbergasted, and I asked a couple of times "Really? Do you really claim that Dropbox is unscalable?". The student persisted and kept repeating that "Dropbox has a bottleneck because it is a centralized storage solution, and the distributed solution doesn't have that bottleneck". I couldn't believe my ears.

Dropbox already proved it is scalable: it serves files for more than 200 million users, who store 1 billion files every day. That it has a centralized architecture hosted in the cloud doesn't make it unscalable. As far as I can see, there is no bottleneck caused by Dropbox having a more centralized architecture.

(For those who want to nitpick, I know Dropbox is not fully centralized; it uses AWS S3 for storage and Dropbox-company servers for metadata management. It also employs data parallelism in the backend for scalability. But, on the spectrum, it is closer to a centralized architecture than a fully decentralized one.)

Distributed is not necessarily scalable!

Some people when faced with a problem think, I know, I'll use distributed computing. Now they have N^2 problems. -- @jamesiry

Here is the second part. Distributing a system does not necessarily make it scalable. In fact, a fully decentralized architecture can sometimes be a disadvantage for scaling.

Consider Lamport's mutual exclusion (ME) algorithm presented in his seminal "Time, Clocks, and the Ordering of Events in a Distributed System". This ME algorithm is fully decentralized, and requires O(N) messages to be exchanged in response to one ME request. The Lamport ME algorithm employs broadcasts to keep all the nodes informed of all updates and get them on the same (more or less) state.

Now consider a centralized algorithm for ME: there is a centralized coordinator; the nodes send their requests to the coordinator, and the coordinator assigns ME accordingly. (For the literalist: You can still have causal ordering in the centralized algorithm. Just use vector clocks (VC) when nodes communicate and include the VC in the request messages.) The centralized ME algorithm is more scalable: the coordinator sends only 1 message (the grant) in response to one ME request, so the per-request cost is O(1) regardless of N. It has less drama and it is easier to maintain and manage.
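To make the message counts concrete, here is a minimal Python sketch. It is a counting model under stated assumptions, not an implementation of either algorithm. The names `lamport_me_messages`, `Coordinator`, and `centralized_me_messages` are hypothetical. The Lamport count uses the usual request/acknowledge/release accounting, which gives 3(N-1) messages per critical-section entry. The centralized count is deliberately generous: it includes the request and the release as well as the grant.

```python
from collections import deque


def lamport_me_messages(n: int) -> int:
    """Messages per critical-section entry in Lamport's ME with n nodes.

    The requester broadcasts a request to the other n-1 nodes, each of them
    acknowledges, and the requester broadcasts a release: 3 * (n - 1).
    """
    return 3 * (n - 1)


class Coordinator:
    """Toy centralized ME coordinator (hypothetical, single-process model)."""

    def __init__(self) -> None:
        self.holder = None      # node currently in the critical section
        self.waiting = deque()  # FIFO queue of pending requesters
        self.messages = 0       # total messages sent or received

    def request(self, node: str) -> None:
        self.messages += 1      # node -> coordinator: request
        self.waiting.append(node)
        self._grant_if_free()

    def release(self, node: str) -> None:
        assert node == self.holder, "only the holder may release"
        self.messages += 1      # node -> coordinator: release
        self.holder = None
        self._grant_if_free()

    def _grant_if_free(self) -> None:
        if self.holder is None and self.waiting:
            self.holder = self.waiting.popleft()
            self.messages += 1  # coordinator -> node: grant


def centralized_me_messages(n: int) -> int:
    """Run one critical-section entry per node; return messages per entry."""
    coord = Coordinator()
    nodes = [f"node{i}" for i in range(n)]
    for node in nodes:
        coord.request(node)
    for node in nodes:          # grants are FIFO, so nodes hold in this order
        coord.release(node)
    return coord.messages // n
```

Under this accounting, the centralized cost stays at 3 messages per entry however many nodes there are, while the Lamport cost grows linearly with N.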

Single point of failure?

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable. -- Leslie Lamport

A common reflex argument against centralized solutions is that they constitute a single point of failure (SPOF). But if a distributed solution is not designed carefully, it will have multiple points of failure (MPOF). Which one would you rather have?

Let's reconsider the Lamport ME and the centralized ME algorithms. The distributed algorithm does not offer any fault-tolerance advantages. Both algorithms are prone to getting stuck with one crash failure.
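A rough way to see this is the following Python sketch. It is a simplified liveness model, not either algorithm: the function names are hypothetical, crashes are assumed to be permanent and undetected, and a crash of the node currently holding the lock (which blocks both schemes) is ignored.

```python
def lamport_can_enter(nodes: set, crashed: set, requester: str) -> bool:
    """In Lamport's ME the requester must hear from every other node,
    so a single crashed peer is enough to block it forever."""
    if requester in crashed:
        return False
    others = nodes - {requester}
    return others.isdisjoint(crashed)


def centralized_can_enter(coordinator: str, crashed: set, requester: str) -> bool:
    """With a coordinator, the requester only needs the coordinator alive."""
    return requester not in crashed and coordinator not in crashed


nodes = {"a", "b", "c", "d"}
# One crashed peer blocks the decentralized algorithm...
print(lamport_can_enter(nodes, {"d"}, "a"))      # False
# ...while the centralized one is blocked only by a coordinator crash.
print(centralized_can_enter("c", {"d"}, "a"))    # True
print(centralized_can_enter("c", {"c"}, "a"))    # False
```

In this model the decentralized algorithm has N places where one crash stalls progress, and the centralized one has a single, known place.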

In fact, we can argue that it is easier to add fault-tolerance to a centralized solution: you can employ Paxos to replicate the centralized server. In contrast, it is often much harder to design and add fault-tolerance to a distributed system. Since a distributed system is complex, it is more prone to introduce corner cases that jeopardize fault-tolerance.

Conclusion

Distributed is not necessarily more scalable than centralized;
And centralized is not necessarily a scalability bottleneck.

As a distributed systems professor, I wouldn't have imagined myself defending centralized solutions. But there it is.

To avoid potential misunderstandings, I am not saying fully distributed/decentralized solutions are bad and to be avoided. There are advantages to decentralization, like latency reduction. And some conditions require decentralization, like geographic/political/corporate isolation. We know that in the real world it is a mix of centralized, up to where that is manageable and has reasonable cost, and some distributed architecture. This also depends very much on the application/task.

PS: Maybe we should make an XtraNormal animation movie about this "centralized unscalable and distributed scalable" mania. Any takers?

PS2: I thank @tedherman for improvements to the 1st draft.

PS3: "Optimistic replication" is a great survey of more decentralized replication protocols, their advantages, and challenges.

Bonus Section: Paxos is a relatively centralized approach to distributed consensus

Consensus is usually not an all-hands-on process. That can be hard to scale. Consider our democratic system: it is pretty centralized; we only elect leaders to rule for us.

In the same sense, you can think of Paxos as the more centralized approach to distributed consensus and state machine replication. In Paxos, the participants do not interact with all other participants to decide the order of requests to be accepted; instead, the leader dictates the order of requests and the participants just accept them. (A fully decentralized consensus algorithm would be like the synchronous rounds consensus algorithm, where in every round each participant communicates with all other participants so that they can converge on the same state.)
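The difference in communication pattern can be put in numbers with a back-of-the-envelope Python sketch. The function names are hypothetical and the model is deliberately crude: the leader-based count assumes a stable leader that sends one message to each of the other N-1 participants and gets one reply from each, ignoring leader election, retries, and batching. The all-to-all count assumes every participant sends to every other participant in each round.

```python
def leader_based_messages(n: int) -> int:
    """Leader -> each follower, then each follower -> leader: 2 * (n - 1)."""
    return 2 * (n - 1)


def all_to_all_messages(n: int, rounds: int = 1) -> int:
    """Every participant sends to every other one, in each round."""
    return rounds * n * (n - 1)


for n in (3, 5, 9):
    print(n, leader_based_messages(n), all_to_all_messages(n))
```

The leader-based count grows linearly in N per ordered request, while the all-to-all count grows quadratically per round.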
