There’s a saying that permeates Silicon Valley that goes something like, “if you want to know how the rest of the world will run its infrastructure in 10 years, go look at what Google is doing today.” While this may be slight hyperbole, there’s no question that Google has already left an indelible imprint on traditional enterprise IT, from commoditizing data storage and processing, to ushering in the NoSQL era, to popularizing the warehouse-scale computing paradigm that dominates Web-scale outfits like Facebook, Twitter and Uber.
Looking out across datacenters today, there’s a massive shift taking place, driven by the rise of decentralized applications (which in turn has been driven by the scale and performance requirements imposed by mobile and cloud). As companies of all sizes transform into software organizations, they are running into the same set of challenges around agility, scale and uptime while trying to rein in costs. Not coincidentally, these were all challenges faced, and more or less solved, by Google as it came to dominate the Web.
So, with the cloud-native transformation in full swing, we can begin to understand and appreciate the changes taking place up and down the stack for the rest of the world by examining how Google has architected its compute and data tiers and how it handles monitoring and management. Three Google papers in particular highlight the emergent architectural paradigms we’re seeing today and the motivations behind their rapid adoption.
Large-scale cluster management at Google with Borg — Verma et al. (Google) 2015
Google has been running on containerized infrastructure for nearly a decade and manages cluster resources, scheduling, orchestration and deployment through a proprietary system called Borg, which provides three main benefits:
“…it (1) hides the details of resource management and failure handling so its users can focus on application development instead; (2) operates with very high reliability and availability, and supports applications that do the same; and (3) lets us run workloads across tens of thousands of machines effectively.”
Borg serves as the central nervous system for Google’s datacenters. Instead of standing up separate clusters for each workload or application, whether that be batch data processing jobs, stateful services or long-running services like Search or Gmail, Google devs deploy code to Borg, which then distributes and schedules workloads and allocates the requisite resources (CPU, memory and so on). Borg enables Google to achieve greater agility (through developer self-service and automation), fault tolerance, scalability and resource utilization; in fact, Borg is rumored to have saved Google the capex associated with the build-out of at least two additional datacenters.
Today we’re seeing the adoption of several Borg-like distributed runtimes, with Apache Mesos, Google’s / CNCF’s Kubernetes and Docker’s Swarm leading the way. These platforms promise to streamline deployment, schedule and orchestrate containerized services, and create the kind of cross-datacenter autonomic computing fabric that Google has been running on for years.
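The placement problem at the heart of Borg-like schedulers (matching a task’s CPU and memory requests against each machine’s free capacity) can be sketched in a few lines. The best-fit policy and all names below are illustrative simplifications, not Borg’s actual algorithm or API:

```python
# Toy best-fit scheduler in the spirit of Borg's resource allocation.
from dataclasses import dataclass, field

@dataclass
class Machine:
    name: str
    cpu: float    # free CPU cores
    mem: float    # free memory, GiB
    tasks: list = field(default_factory=list)

def schedule(task_name, cpu, mem, machines):
    """Place a task on the feasible machine with the least leftover
    capacity (best fit), packing the cluster tightly to raise utilization."""
    feasible = [m for m in machines if m.cpu >= cpu and m.mem >= mem]
    if not feasible:
        return None  # a real system would queue, preempt or scale out
    best = min(feasible, key=lambda m: (m.cpu - cpu) + (m.mem - mem))
    best.cpu -= cpu
    best.mem -= mem
    best.tasks.append(task_name)
    return best.name

machines = [Machine("m1", cpu=4, mem=8), Machine("m2", cpu=2, mem=4)]
print(schedule("search-frontend", cpu=2, mem=3, machines=machines))  # m2: least slack
print(schedule("batch-job", cpu=3, mem=6, machines=machines))        # m1: only fit left
```

Borg layers priorities, preemption and constraint matching on top of this basic placement loop, but the underlying goal is the same: bin-pack heterogeneous workloads onto shared machines so fewer machines sit idle.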
Spanner: Google’s Globally-Distributed Database — Corbett et al. (Google) 2012
At Google, all services require distributed operation, must be highly available and must handle machine failures transparently. It stands to reason that the stateful services backing these workloads are of proportional scale, also require distributed operation and must be similarly, if not more, resilient to outages.
Spanner is Google’s globally distributed, synchronously replicated database, which backs nearly every application within Google. Spanner features horizontal scalability, automatic sharding and fault tolerance similar to its NoSQL predecessor Bigtable. However, it’s Spanner’s consistent replication, support for transactional semantics, relational data model and SQL interface that set it apart from many prevailing NoSQL data stores (DynamoDB, Cassandra, Mongo, etc.).
The motivation for strong consistency and distributed transactions was increased developer productivity. Google’s lesson from Bigtable was that it’s ultimately easier for the database to handle transaction logic while devs focus on optimizing performance at the application tier.
“Finally, the lack of cross-row transactions in Bigtable led to frequent complaints…We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”
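The trade-off in the quote above can be made concrete with a toy example: a store that applies a multi-row change atomically, committing all of it or none of it, which is exactly the logic Bigtable users had to hand-roll in application code. The `TinyDB` class below is invented for illustration and bears no relation to Spanner’s actual interface:

```python
# Toy illustration of a cross-row transaction: either every row change
# lands, or none do. Invented for illustration; not Spanner's API.
import copy

class TinyDB:
    def __init__(self):
        self.rows = {}

    def transaction(self, fn):
        """Run fn against a private copy of the data and commit only if
        it succeeds; on any error the original rows are left untouched."""
        snapshot = copy.deepcopy(self.rows)
        try:
            fn(snapshot)
        except Exception:
            return False          # abort: no partial writes visible
        self.rows = snapshot      # commit atomically
        return True

db = TinyDB()
db.rows = {"alice": 100, "bob": 50}

def transfer(rows):
    rows["alice"] -= 70
    if rows["alice"] < 0:
        raise ValueError("insufficient funds")
    rows["bob"] += 70

print(db.transaction(transfer), db.rows)  # commits: both rows updated
print(db.transaction(transfer), db.rows)  # aborts: neither row changed
```

Without database-level transactions, every service touching these rows would need its own version of this snapshot-and-rollback logic; Spanner’s position is that the database should own it once, even at some performance cost.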
In a single-server monolithic application, or even a three-tier Web app, managing state was fairly straightforward. But now, as apps become decentralized and decomposed into dozens or hundreds of services, stateful back-ends must be distributed and multi-tenant in kind. Of course, running a distributed, shared database traditionally requires a team of DBAs, so more recently we’ve seen the emergence of solutions like AWS Aurora, a fully managed scale-out RDBMS, and even more ambitious projects like CockroachDB (inspired by Google’s Spanner), which gives you out-of-the-box scalability and survivability with strong consistency, geo-replication and a SQL interface.
Dapper, a Large-Scale Distributed Systems Tracing Infrastructure — Sigelman et al. (Google) 2010
With applications being built as complex distributed systems composed of dozens, hundreds or thousands of discrete services, debugging, performance tuning, introspection and monitoring become major organizational challenges. The operational complexity associated with decentralized applications (daily or hourly deploys, polyglot code bases, ephemeral compute, multi-tenant infrastructure with multiple layers of abstraction) necessitates a new monitoring paradigm.
It’s this complexity which motivated Google to create Dapper, a production distributed systems tracing infrastructure which provides…
“…developers with information about the behavior of complex distributed systems…When systems involve not just dozens of subsystems but dozens of engineering teams, even [the] best and most experienced engineers routinely guess wrong about the root cause of poor end-to-end performance. In such situations, Dapper can furnish much-needed facts and is able to answer many important performance questions conclusively.”
The power of tracing lies in the ability to introspect the entire data path, from client request down to system call, at a given moment in time. In a traditional monolithic system, debugging and latency monitoring are well understood, but once a production system contends with concurrency or is divided into many services, these crucial (and formerly easy) tasks become complex. Given that, we’re now seeing a push toward tracing standards with the OpenTracing project, which should buoy the adoption of open source tracing systems like Zipkin and HTrace as well as platforms like LightStep.
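Mechanically, a Dapper-style trace boils down to spans that share a trace id and record their parent span, so that work scattered across services can later be stitched back into a single request tree. The sketch below is a minimal illustration under that assumption; the class and names are invented, not Dapper’s or OpenTracing’s real API:

```python
# Minimal sketch of Dapper-style span tracing. Names are illustrative.
import time
import uuid

SPANS = []  # stand-in for a trace collector

class Span:
    def __init__(self, name, trace_id=None, parent_id=None):
        self.name = name
        self.trace_id = trace_id or uuid.uuid4().hex  # shared by the whole request
        self.span_id = uuid.uuid4().hex
        self.parent_id = parent_id                    # links this span into the tree

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, *exc):
        self.duration = time.monotonic() - self.start
        SPANS.append(self)

    def child(self, name):
        # Propagate trace context; across services this rides in RPC metadata.
        return Span(name, trace_id=self.trace_id, parent_id=self.span_id)

with Span("frontend:/search") as root:
    with root.child("spellcheck-rpc"):
        pass
    with root.child("index-lookup-rpc"):
        pass

# Every span in the request carries the same trace id,
# so a collector can reassemble the tree after the fact.
assert all(s.trace_id == root.trace_id for s in SPANS)
```

The hard parts Dapper actually solves, such as propagating this context through RPC libraries transparently, sampling to keep overhead negligible and collecting spans out-of-band, sit on top of this simple span model.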
This post originally appeared on the blog Memory Leak by Lenny Pruss.