Cassandra High Performance Cookbook: Second Edition
- Understand the Column Family data model and other Big Data schema modelling techniques from practical real world examples
- Write applications to store and access data in Cassandra using both the RPC and Cassandra Query Language interfaces
- Deploy multi-node, multi-data center Cassandra clusters for high availability
Book Details
Language : English
Paperback : 350 pages [ 235mm x 191mm ]
Release Date : June 2013
ISBN : 1782161805
ISBN 13 : 9781782161806
Author(s) : Edward Capriolo
Topics and Technologies : All Books, Cookbooks, Open Source
What you will learn from this book
- Understand how Cassandra combines the features of Google's Big Table Column Family data model with Amazon Dynamo's fully distributed design
- Store and organize terabytes of data efficiently across hundreds of nodes in multiple geographic locations
- Monitor Cassandra clusters including detecting, tuning, and correcting common problems
- Access and modify data using the Cassandra Query Language as well as the Thrift RPC API through Hector
- Carry out administrative tasks like joining new nodes to the cluster or performing snapshot backups
- Use features including time-to-live columns, counter columns, composite columns, wide rows, secondary indexes, and collections to efficiently store complex data
- Use Hadoop in combination with Cassandra and run MapReduce jobs that read from or write to Cassandra, including Hive jobs
- Use Cassandra with a variety of libraries including Hector Spring, Hadoop, Hive, Flume, and more
Across the planet, data is growing rapidly, and so is the need for scalable data storage systems. Apache Cassandra was designed from the ground up to be a fully distributed, low-latency system to store and access data across hundreds of physical servers and multiple geographic locations. Cassandra as a product is evolving fast. As a consequence, the Cassandra community and the list of customers continue to grow, and so does the number of use cases.
"Cassandra High Performance Cookbook: Second Edition" covers more than 100 recipes on how to use Apache Cassandra in different scenarios. The recipes range from installation and changing settings for optimal performance to schema design, integration, querying, and more, so that massive amounts of data can be stored, retrieved, and analyzed quickly.
The book starts by explaining how to quickly set up Cassandra. After learning the basics, we start developing an application that stores data in Cassandra. By the end of the book, the user understands the key features of the database. The book's recipes show the user many important concepts, starting with schema design for storing data in Cassandra. We then show how to interact with Cassandra using the command line interface and programs. We also cover performance tuning tips from the operating system level to the application level. The book describes how to administer large Cassandra clusters by adding and removing nodes and even entire data centres. We show third-party libraries and applications that can be used with Cassandra, and we cover in-depth performance analysis and monitoring techniques.
"Cassandra High Performance Cookbook: Second Edition" is a practical, hands-on guide to working with Apache Cassandra, one of the most popular NoSQL data stores, showing you how to design scalable storage systems for low-latency access to big data. The book comprises more than 100 recipes, beginning with intuitive recipes on installation, configuration, and administration, followed by advanced recipes on integration, performance analysis, and monitoring techniques, using CQL and the Hector client.
Who this book is for
This book is designed for administrators, developers, and data architects who are interested in Apache Cassandra for redundant, high-performing, and scalable data storage. Typically, these users should have experience working with a database technology, multiple-node computer clusters, and high availability solutions.