Azure Cosmos Db: Microsoft's Cloud-Born Globally Distributed Database

It has been almost ix months since I started my sabbatical work amongst the Microsoft Azure Cosmos DB team. 

I knew what I signed upwards for then, as well as I knew it was overwhelming.
It is difficult non to acquire overwhelmed. Cosmos DB provides a global highly-available low-latency all-in-one database/storage/querying/analytics service to heavyweight demanding businesses. Cosmos DB is used ubiquitously within Microsoft systems/services, as well as is besides ane of the fastest-growing services used yesteryear Azure developers externally. It manages 100s of petabytes of indexed data, as well as serves 100s of trillions of requests every 24-hour interval from thousands of customers worldwide, as well as enables customers to construct highly-responsive mission-critical applications.
But I underestimated how much at that topographic point is to larn about, as well as how long it would live to educate a skilful feel of the large picture. By "developing a skilful feel of the large picture", I hateful learning/internalizing the territory myself, and, when looking at the terrain, existence able to appreciate the excruciatingly tiny as well as laborious details inwards each square-inch equally well.

Cosmos DB has many large teams working on many dissimilar parts/fronts at  whatever point. I am all the same amazed yesteryear the activity going on simultaneously inwards so many dissimilar fronts. Testing, security, inwardness development, resources governance, multitenancy, inquiry as well as indexing, scale-out storage, scale-out computing, Azure integration, serverless functions, telemetry, client support, devops, etc. Initially all I cared for was the inwardness evolution forepart as well as the concurrency command protocols there. But overtime via diffusion as well as osmosis I came to learn, understand, as well as appreciate the challenges inwards other fronts equally well.

It may audio weird to say this after ix months, but I am all the same a beginner. Maybe I withdraw hold been slow, but when you lot bring together a real large project, it is normal that you lot outset working on something small, as well as on a parallel thread you lot gradually educate a feel of an emerging large flick through your peripheral vision. This may withdraw hold many months, as well as inwards Microsoft large teams are aware of this, as well as give you lot your infinite equally you lot bloom.

This is a lot similar learning to drive. You outset inwards the parking lots, inwards less crowded suburbs, as well as and so equally you lot acquire to a greater extent than skilled you lot decease into the highway networks as well as acquire a fuller persuasion of the territory. It is overnice to explore, but you lot wouldn't live able to exercise it until you lot educate your skills. Even when someone shows you lot the map persuasion as well as tell you lot that these exists, you lot don't fully realize as well as appreciate them. They are only concepts to you, as well as you lot withdraw hold a real superficial familiarity amongst them until you lot outset to explore them yourself.

As a result, things seem to motion tardily if you lot are an engineer working on a small-scale thing, say datacenter automatic failover component, inwards ane component of the territory. Getting this constituent right may withdraw hold months of your time. But it is OK. You are edifice a novel street inwards ane of the suburbs, as well as this adds upwards to the large picture. The large flick keeps growing at a steady rate, as well as the colors acquire to a greater extent than vibrant, fifty-fifty though individually you lot experience similar you lot are progressing slowly.

"A one one thousand details add together upwards to ane impression." -Cary Grant 
"The implication is that few (if any) details are individually essential, piece the details collectively are absolutely essential." -McPhee

When I started inwards August, I had the suspicion that I needed to unpack this term: "cloud-native".
(The term "cloud-native" is a loaded cardinal term, as well as the squad doesn't work it lightly. I volition assay to unpack some of it here, as well as I volition revisit this inwards my after posts.)
Again, I underestimated how long it would withdraw hold me to acquire a amend appreciation of this. After learning close the many fronts things are progressing, I was able to acquire a amend agreement of how of import as well as challenging this is.

So inwards the residue of the post, I volition assay to give an overview of things I learned close what a cloud-native distributed database service does. It is possible to write a split upwards weblog ship for each paragraph, as well as I promise to expand on each inwards the future.

What does a cloud-born distributed database await like?

On the quest to accomplish cost-effective, reliable as well as assured, general-purpose as well as customizable/flexible operation, a globally distributed cloud database faces many challenges as well as mutually conflicting goals inwards dissimilar dimensions. But it is of import to hitting equally high a grade inwards equally many of the dimensions equally possible.

While several databases hitting ane or 2 of these dimensions,  hitting all these dimensions together is real challenging.  For example, the scale challenge becomes much harder when the database needs to supply guaranteed performance at whatever indicate inwards the scale curve. Providing virtually unlimited storage becomes to a greater extent than challenging when the database besides needs to reckon stringent performance SLAs at whatever point. Finally, achieving these piece providing tenant-isolation as well as managing resources to preclude whatever tenant from impacting the performance of others is extra challenging, but is required for providing as well as cost-efficient cloud database.

Trying to add together back upwards as well as realize substantial improvements for all of these dimensions does non piece of work equally an afterthought. Cosmos DB is able to hitting a high grade inwards all these dimensions because it is designed from the Earth upwards to reckon all these challenges together. Cosmos DB provides a frictionless cloud-native distributed database service via:

  • Global distribution (with guaranteed consistency as well as latency) yesteryear virtue of transparent multi-master replication,
  • Elastic scalability of throughput as well as storage worldwide (with guaranteed SLAs) yesteryear virtue of horizontal partitioning, and
  • Fine grained multi-tenancy yesteryear virtue of highly resource-governed organisation stack all the means from the database engine to the replication protocol.

To realize these, Cosmos DB uses a novel nested distributed replication protocol, robust scalability techniques, as well as well-designed resources governance abstractions, as well as I volition assay to innovate these next.

Scalability, Global Distribution, as well as Resource Governance


To accomplish high-scalability as well as availability, Cosmos DB uses a novel protocol to replicate information across nodes as well as datacenters amongst minimal latency overheads as well as maximum throughput. For scalability, information is automatically sharded guided yesteryear storage as well as throughput triggers, as well as the shards are assigned to dissimilar partitions. Each segmentation is served yesteryear a multiple node replica-set inwards each region, amongst ane node acting equally a primary replica to grip all write operations as well as replication within the replicaset. Reading the information is carried out from a quorum of secondary replicas yesteryear the clients. (Soft-state scale out storage as well as computation implementations supply independent scaling for storage-heavy as well as computation-heavy workloads.)

Since the replica-set maintains multiple copies of the data, it tin mask some replica failures inwards a part without sacrificing the availability fifty-fifty inwards the potent consistency mode. (While 2 simultaneous replica failure may live rare, it is to a greater extent than mutual to notice ane replica failure piece some other replica is unavailable due to rolling upgrades, as well as masking 2 replica failures tin assistance for shine uninterrupted performance for the cloud database.) Each part contains an independent configuration managing director to maintain organisation configuration as well as perform leader election for the partitions. Based on the membership changes, the replication protocol besides reconfigures the size of read as well as write quorums.


In add-on to the local replication within a replica-set, at that topographic point is besides geo-replication which implements distribution across any disclose of Azure regions ---50+ of them. Geo-replication is achieved yesteryear a nested consensus distribution protocol across the replica-sets inwards dissimilar regions. Cosmos DB provides multimaster active-active replication inwards guild to let reads as well as writes from whatever region. For most consistency models, when a write originates inwards some region, it becomes right away available inwards that part piece existence sent to an arbiter (ARB) for ordering as well as conflict resolution. The ARB is a virtual procedure that tin co-locate inwards whatever of the regions. It uses the distribution protocol to re-create the information to primaries inwards each region, which as well as so replicate the information inwards their respective regions. The Cosmos DB distribution as well as replication protocols are verified at the pattern bird amongst TLA+ model checker as well as the implementations are farther tested for consistency problems using (an MS Windows port of) Jepsen tests.


Cosmos DB is engineered from the Earth upwards amongst resources governance mechanisms inwards guild to supply an isolated provisioned throughput experience (backed upwards yesteryear SLAs), piece achieving high density packing (where 100s of tenants portion the same machine as well as 1000s portion the same cluster). To this end, Cosmos DB define an abstract rate-based currency for throughput, called Request Unit or RU (plural, RUs) that supply a normalized model for accounting the resources consumed yesteryear a request, as well as accuse the customers for throughput across various database operations consistently as well as inwards a hardware agnostic manner. Each RU combines a small-scale portion of CPU, retention as well as storage IOPS. Tenants inwards Cosmos DB command the desired performance they demand from their containers yesteryear specifying the maximum throughput RUs for a given container. Viewed from the lens of resources governance, Cosmos DB is a massively distributed queuing organisation amongst cascaded stages of components, each carefully calibrated to deliver predictable throughput piece operating within the allotted budget of organisation resources, guided yesteryear the principles of Little's Law as well as Universal Scalability Law. The implementation of resources governance leverages on the backend partition-level capacity management as well as across-machine charge balancing.

Contributions of Cosmos DB

By leveraging on the cloud-native datastore foundations above, Cosmos DB provides an almost "one size fits" database. Cosmos DB provides local access to information amongst depression latency as well as high throughput, offers tunable consistency guarantees within as well as across datacenters, as well as provides a broad diverseness of information models as well as APIs leveraging powerful indexing as well as querying support. Since Cosmos DB supports a spectrum of consistency guarantees as well as industrial plant amongst a various prepare of backends, it resolves the integration problems companies human face upwards when multiple teams work dissimilar databases to reckon dissimilar use-cases.

Vast bulk of other data-stores supply either eventual, potent consistency, or both. In contrast, the spectrum of consistency guarantees Cosmos DB provides meets the consistency as well as availability needs of all venture applications, spider web applications, as well as mobile apps. Cosmos DB enables applications create upwards one's hear what is best for them as well as to brand dissimilar consistency-availability tradeoffs. An eventually-consistent store allows diverging soil for a menses of fourth dimension inwards central for to a greater extent than availability as well as lower latency as well as perhaps suitable for relaxed consistency applications, such equally recommendation systems or search engines. On the other hand, a shopping cart application may require a "read your ain writes" holding for right operation. This is served yesteryear the session consistency guarantee at Cosmos DB. Extending this across many clients requires provides "read latest writes" semantics as well as requires potent consistency. This ensures coordination amidst many clients of the same application as well as makes the task of the application developer easy. The potent consistency is preserved both within a unmarried part equally good equally across all associated regions. The bounded staleness consistency model guarantees whatever read asking returns a value within the most recent k versions or t time. It offers global total guild except within the staleness window, thence it is a slightly weaker guarantee than the potent consistency.


Another means Cosmos DB manages to decease an all-in-one database is yesteryear providing multiple APIs as well as serving dissimilar information models. Web as well as mobile applications demand a spectrum of choices/alternatives for their information models: some applications withdraw hold wages of simpler models, such equally key-value or column-family, as well as yet some other applications require to a greater extent than specialized information models, such equally document as well as graph stores. Cosmos DB is designed to back upwards APIs as well as information models yesteryear projecting the internal document store into dissimilar representations depending on the selected model. Clients tin pick betwixt document, SQL, Azure Table, Cassandra, as well as Gremlin graph APIs to interact amongst their datastore. This non exclusively provides keen flexibility for our clients, but besides allows them to effortlessly migrate their applications over to Cosmos DB.

Finally, Cosmos DB provides the tightest SLAs inwards the industry. In contrast to many other databases that supply exclusively availability SLAs, Cosmos DB besides provides  consistency SLAs, latency SLAs, as well as throughput SLAs. Cosmos DB guarantees the read as well as write operations to withdraw hold nether 10ms at the 99th percentile for a typical 1KB object. The SLAs for throughput guarantee that the clients to have the throughput equivalent to the resources provisioned to the occupation organisation human relationship via RUs.

0 Response to "Azure Cosmos Db: Microsoft's Cloud-Born Globally Distributed Database"

Post a Comment

Iklan Atas Artikel

Iklan Tengah Artikel 1

Iklan Tengah Artikel 2

Iklan Bawah Artikel