Scaling Memcache at Facebook

This paper, which appeared in NSDI 2013, describes the evolution of Facebook's memcached-based architecture. Memcached is an open-source implementation of a distributed in-memory hash table over a cluster of computers, and it is used for providing low-latency access to a shared storage pool at low cost. Memcached was first developed by Brad Fitzpatrick in 2003.

Memcached at Facebook targets a workload dominated by reads (which are two orders of magnitude more frequent than writes). Facebook needs to support a very heavy read load, over 1 billion reads/second, and needs to insulate backend services from high read rates with very high fan-out. To this end, Facebook uses memcache as a "demand-filled look-aside cache". When a web server needs data, it first requests the value from memcache by providing a string key. If it is a cache hit, great, the read operation is served. If the item addressed by that key is not cached, the web server retrieves the data from the backend database and populates the cache with the key-value pair (for the benefit of the upcoming reads).
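
As a minimal sketch of this look-aside read path (the cache dict and db_query function below are stand-ins for the memcached cluster and the backend database, not Facebook's actual API):

    cache = {}  # stand-in for the memcached cluster

    def db_query(key):
        return f"row-for-{key}"  # stand-in for a SQL read against the backend

    def read(key):
        value = cache.get(key)
        if value is not None:
            return value          # cache hit: served straight from memcache
        value = db_query(key)     # cache miss: fetch from the backend database
        cache[key] = value        # demand-fill the cache for upcoming reads
        return value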

For write requests, the web server issues SQL statements to the database and then sends a delete request to memcache to invalidate the stale data, if any. Facebook chooses to delete cached data instead of updating it in the cache because deletes are idempotent (which is a nice property to have in distributed systems). This is safe because memcache is not the authoritative source of the data and is therefore allowed to evict cached data.
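
Continuing the sketch above, the write path under the same stand-in names:

    def db_execute(sql):
        pass  # stand-in for issuing the SQL write to the backend database

    def write(key, sql):
        db_execute(sql)         # 1. update the authoritative store first
        cache.pop(key, None)    # 2. delete (not update) the cached entry;
                                #    deletes are idempotent, so a retry after
                                #    a failure or duplicate delivery is harmless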

In a Cluster: Reducing Latency and Load

Loading a popular page in Facebook results in fetching from memcache an average of 521 distinct items (a very high fan-out indeed), which have been distributed across the memcached servers through consistent hashing. All web servers communicate with every memcached server in a short period of time, and this all-to-all communication pattern can cause incast congestion or cause a single server to become the bottleneck for many web servers.
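
The paper only names consistent hashing without spelling out the implementation; a minimal generic ring, with the 100 virtual nodes per server being an illustrative parameter, could look like this:

    import hashlib
    from bisect import bisect

    class ConsistentHashRing:
        """Generic consistent-hash ring: a key maps to the first server point
        clockwise on the ring, so adding or removing a server only remaps the
        keys that server owned."""

        def __init__(self, servers, vnodes=100):  # vnodes: illustrative choice
            self.ring = sorted(
                (self._hash(f"{server}:{i}"), server)
                for server in servers
                for i in range(vnodes)
            )
            self.points = [point for point, _ in self.ring]

        @staticmethod
        def _hash(key):
            return int(hashlib.md5(key.encode()).hexdigest(), 16)

        def server_for(self, key):
            idx = bisect(self.points, self._hash(key)) % len(self.ring)
            return self.ring[idx][1]

    ring = ConsistentHashRing(["mc1", "mc2", "mc3"])
    print(ring.server_for("user:42:profile"))  # e.g. "mc2"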

Memcached servers do not communicate with each other. When appropriate, Facebook embeds the complexity of the system into a stateless client rather than in the memcached servers. Client logic is provided as two components: a library that can be embedded into applications, or a standalone proxy named mcrouter. Clients use UDP and TCP to communicate with memcached servers. These client-centric optimizations reduce incast congestion at the servers (via application-specific UDP congestion control) and reduce load on the servers (via the use of lease tokens [like cookies] by the clients). The details are in the paper.
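
Roughly, the lease mechanism works as follows: on a miss, memcached hands the client a lease token, and the server accepts the subsequent set only if that token is still valid (a delete invalidates outstanding tokens, and token issuance is rate-limited per key). A client-side sketch, where get_with_lease, set_with_lease, and db_query are hypothetical names rather than real memcached protocol calls:

    import time

    def get_with_lease(key):
        """Hypothetical call: returns (value, lease_token). On a miss the
        server hands out a lease token; if another client already holds the
        lease, both value and token are None."""
        raise NotImplementedError  # stand-in for the real protocol call

    def set_with_lease(key, value, lease):
        """Hypothetical call: the set succeeds only if the lease token is
        still valid (a delete invalidates outstanding tokens)."""
        raise NotImplementedError  # stand-in

    def db_query(key):
        raise NotImplementedError  # stand-in for the backend database read

    def read_with_lease(key):
        while True:
            value, lease = get_with_lease(key)
            if value is not None:
                return value                  # cache hit
            if lease is None:
                time.sleep(0.01)              # another client is refilling;
                continue                      # back off instead of stampeding the DB
            value = db_query(key)
            set_with_lease(key, value, lease) # silently dropped if a delete
            return value                      # invalidated this token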

For handling failures of memcached nodes, Facebook uses the redundancy/slack idea. While Facebook relies on an automated remediation system for dealing with node failures, this can take up to a few minutes. This duration is long enough to cause cascading failures, and thus Facebook uses a redundancy/slack mechanism to further insulate backend services from failures. Facebook dedicates a small set of machines, named Gutter, to take over the responsibilities of a few failed servers. Gutter accounts for about 1% of the memcached servers in a cluster. When a memcached client receives no response to its get request, the client assumes the server has failed and issues the request again to a special Gutter pool. If this second request misses, the client will insert the appropriate key-value pair into the Gutter machine after querying the database.
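
A sketch of this client-side fallback, with primary_get, gutter_get, and gutter_set as illustrative names; per the paper, entries in Gutter expire quickly (the 10-second TTL below is an assumed value), so values served from it may be slightly stale:

    NO_REPLY = object()  # sentinel: the primary server did not answer at all

    def primary_get(key):
        raise NotImplementedError  # stand-in: returns a value, None on a miss,
                                   # or NO_REPLY when the server is unreachable

    def gutter_get(key):
        raise NotImplementedError  # stand-in for a get against the Gutter pool

    def gutter_set(key, value, expire):
        raise NotImplementedError  # stand-in for a set against the Gutter pool

    def db_query(key):
        raise NotImplementedError  # stand-in for the backend database read

    def robust_get(key):
        value = primary_get(key)
        if value is not NO_REPLY:
            return value                   # normal path: hit, or a miss handled
                                           # by the usual demand-fill logic
        value = gutter_get(key)            # no reply: assume failure, try Gutter
        if value is not None:
            return value
        value = db_query(key)              # Gutter miss: go to the database
        gutter_set(key, value, expire=10)  # short TTL: Gutter sees no deletes,
        return value                       # so entries must age out quickly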

Note that this design differs from an approach in which a client rehashes keys among the remaining memcached servers. Such an approach risks cascading failures due to non-uniform key access frequency: a single key can account for 20% of a server's requests. The server that becomes responsible for this hot key might also become overloaded. By shunting load to idle servers, Facebook limits that risk. In practice, this system reduces the rate of client-visible failures by 99% and converts 10%–25% of failures into hits each day. If a memcached server fails entirely, hit rates in the Gutter pool generally exceed 35% in under four minutes and often approach 50%. Thus when a few memcached servers are unavailable due to failure or minor network incidents, Gutter protects the backing store from a surge of traffic.
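
For contrast, the rejected rehash-on-failure alternative would look roughly like this; note how a hot key's full load lands on a single survivor:

    def rehash_server_for(key, servers, failed):
        alive = [s for s in servers if s not in failed]
        # A key that accounted for 20% of the failed server's requests now
        # lands, in its entirety, on one surviving server -- which may then
        # overload and fail too, cascading through the cluster.
        return alive[hash(key) % len(alive)]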

In a Region: Replication

Naively scaling out this memcached system does not eliminate all problems. Highly requested items will become more popular as more web servers are added to cope with increased user traffic. Incast congestion also worsens as the number of memcached servers increases. Facebook therefore splits their web and memcached servers into multiple "regions". This region architecture also allows for smaller failure domains and a tractable network configuration. In other words, Facebook trades replication of data for more independent failure domains, a tractable network configuration, and a reduction of incast congestion.


Across Regions: Consistency

When scaling across multiple regions, maintaining consistency between the data in memcache and the persistent storage becomes the primary technical challenge. These challenges stem from a single problem: replica databases may lag behind the master database (this is a per-record master as in PNUTS). Facebook provides best-effort eventual consistency but places an emphasis on performance and availability.

Facebook designates one region to hold the master databases and the other regions to contain read-only replicas, and relies on MySQL's replication mechanism to keep replica databases up-to-date with their masters. In this design, web servers can still experience low latency when accessing either the local memcached servers or the local database replicas.

Concluding remarks

Here are the lessons Facebook draws from their experience with the memcached system. 1) Separating the cache and persistent storage systems via memcached allows for independently scaling them. 2) Managing stateful components is operationally more complex than stateless ones. As a result, keeping logic in a stateless client helps iterate on features and minimize disruption. 3) Finally, Facebook treats the probability of reading transient stale data as a parameter to be tuned, and is willing to expose slightly stale data in exchange for insulating the backend storage service from excessive load.

Related links
AWS seems to offer ElastiCache, which is protocol-compliant with Memcached.
There is also another, which I had summarized before here.
