PNUTS: Building and running a cloud database system

Brian Cooper, Yahoo!

Abstract:

I’ll describe PNUTS, a system we have built at Yahoo! for managing web-scale data. PNUTS is focused on serving systems (low-latency data management to support online web applications) and is complementary to, but different from, our cloud analytical system, Hadoop. I’ll describe the architecture of PNUTS, which has all the now-standard cloud features of distribution, elastic scalability, and failover. I’ll also describe features unique to PNUTS among cloud systems, such as worldwide geographic replication and consistency models that go beyond eventual consistency. PNUTS is both a research project and a production system, and I’ll describe some of the research we are doing on “next generation” features, as well as lessons learned from putting a research system into production.

Bio:

I am a research scientist at Yahoo! Research. Before that, I was an assistant professor at Georgia Tech, and before that, I was a Ph.D. student at Stanford. My interests are in building distributed systems, and in particular, distributed systems that do database-style management and processing of data. At Yahoo! I work on building very large distributed data storage and processing systems. In previous lives, I have worked on self-adaptive peer-to-peer systems, distributed streaming event processing, reliable distributed archival data storage, and XML indexing.