ZooKeeper: Because Distributed Computing is a Zoo

Benjamin Reed, Research Scientist, Yahoo!

Video of lecture

Abstract

Large distributed systems, aka Cloud Computing, are a zoo: machines are added and removed, configurations change, usage spikes, and load must be rebalanced. Distributed applications must also handle failures such as network partitions and crashes. We developed ZooKeeper to help distributed application developers build applications in this chaotic environment. It provides a simple coordination abstraction that developers can rely on to handle changes and failures. In our experience, applications that use ZooKeeper are not only more robust but also easier to develop than those using ad hoc solutions. This talk will motivate the design of ZooKeeper, show how it is used, and share insights gained from its use in production.

Bio

Dr. Benjamin Reed has worked in industry for two decades. He started as an intern working on CAD/CAM systems. From there his career led him to shipping and receiving applications on OS/2, AIX, and CICS, to operations, to system administration research and Java frameworks at IBM Almaden Research (11 years), and finally, three years ago, to Yahoo! Research, where he works on the largest distributed computing problems. His main interests now are large-scale processing environments and highly available, scalable systems. He received his PhD from the University of California, Santa Cruz.