Strange Loop 2010

Adopting Apache Cassandra

11:00 - 11:50am on Thursday, October 14, 2010 in Pageant

The Cassandra database is distributed, highly available, and fault-tolerant, and it offers an elastic scaling model, all of which make Cassandra a powerful proposition for mission-critical applications. It’s used by many of the world’s biggest web properties, including Facebook, Twitter, Digg, StumbleUpon, Reddit, and Cisco.

This is all fantastic, but there’s no free lunch: Cassandra is not a relational database. Instead, it follows in the footsteps of Google’s Bigtable, from which it takes its column-oriented data model, and Amazon’s Dynamo, from which it takes its distributed design. As such, getting your head around how Cassandra works can be daunting, to say the least. There’s a lot of new terminology (what’s a Hinted Handoff? What’s a SuperColumn?? What do I need to know about Vector Clocks??? Argh!). There are some complex algorithms in Cassandra, and new ways of handling basic operations in order to achieve the benefits mentioned above. Cassandra only recently emerged from Apache Incubator status, and there aren’t a lot of tools available yet to smooth your path toward adoption.

This talk can help you understand everything you need to know to get started using Cassandra. We’ll sort out all the terminology and foundational concepts, and then dive into a practical set of ways to start putting Cassandra to work in your applications today.