Learning Cassandra for Administrators

eBook: $17.99
Formats: PDF, PacktLib, ePub, and Mobi
Print + free eBook + free PacktLib access to the book: $47.98    Print cover: $29.99
  • Install and set up a multi-datacenter Cassandra cluster
  • Troubleshoot and tune Cassandra
  • Covers CAP trade-offs and physical/hardware limitations, and helps you understand how Cassandra works internally
  • Tune your kernel and JVM to maximize performance
  • Includes security, monitoring metrics, Hadoop configuration, and query tracing

Book Details

Language : English
Paperback : 120 pages [ 235mm x 191mm ]
Release Date : November 2013
ISBN : 1782168176
ISBN 13 : 9781782168171
Author(s) : Vijay Parthasarathy
Topics and Technologies : All Books, Web Development, Open Source

Table of Contents

Chapter 1: Basic Concepts and Architecture
Chapter 2: Installing Cassandra
Chapter 3: Inserting Data and Manipulating Data
Chapter 4: Administration and Large Deployments
Chapter 5: Performance Tuning
Chapter 6: Analytics
Chapter 7: Security and Troubleshooting
  • Chapter 1: Basic Concepts and Architecture
    • CAP theorem
    • BigTable / Log-structured data model
      • Column families
      • Keyspace
      • Sorted String Table (SSTable)
      • Memtable
      • Compaction
    • Partitioning and replication Dynamo style
      • Gossip protocol
      • Distributed hash table
      • Eventual consistency
    • Summary
  • Chapter 2: Installing Cassandra
    • Memory, CPU, and network requirements
    • Cassandra in-memory data structures
      • Index summary
      • Bloom filter
      • Compression metadata
      • SSDs versus spinning disks
      • Key cache
      • Row cache
    • Downloading/choosing binaries to install
      • Configuring cassandra-env.sh
      • Configuring Cassandra.yaml
        • cluster_name
        • seed_provider
        • Partitioner
        • auto_bootstrap
        • broadcast_address
        • commitlog_directory
        • data_file_directories
        • disk_failure_policy
        • initial_token
        • listen_address/rpc_address
        • Ports
        • endpoint_snitch
        • commitlog_sync
        • commitlog_segment_size_in_mb
        • commitlog_total_space_in_mb
        • Key cache and row cache saved to disk
        • compaction_preheat_key_cache
        • row_cache_provider
        • column_index_size_in_kb
        • compaction_throughput_mb_per_sec
        • in_memory_compaction_limit_in_mb
        • concurrent_compactors
        • populate_io_cache_on_flush
        • concurrent_reads
        • concurrent_writes
        • flush_largest_memtables_at
        • index_interval
        • memtable_total_space_in_mb
        • memtable_flush_queue_size
        • memtable_flush_writers
        • stream_throughput_outbound_megabits_per_sec
        • request_scheduler
        • request_scheduler_options
        • rpc_keepalive
        • rpc_server_type
        • thrift_framed_transport_size_in_mb
        • rpc_max_threads
        • rpc_min_threads
        • Timeouts
      • Dynamic snitch
      • Backup configurations
        • incremental_backups
        • auto_snapshot
    • Cassandra on EC2 instance
      • Snitch
    • Create a keyspace
      • Creating a column family
        • GC grace period
        • Compaction
        • Minimum and maximum compaction threshold
      • Secondary indexes
      • Composite primary key type
        • Options
      • read_repair_chance and dclocal_read_repair_chance
    • Summary
  • Chapter 3: Inserting Data and Manipulating Data
    • Querying data
      • USE
      • CREATE
      • ALTER
      • DESCRIBE
      • SELECT
    • Tracing
    • Data modeling
      • Types of columns
      • Common Cassandra data models
        • Denormalization
        • Creating a counter column family
        • Tweet data structure
        • Secondary index examples
    • Summary
  • Chapter 5: Performance Tuning
    • vmstat
    • iostat
    • dstat
    • Garbage collection
      • Enabling GC logging
      • Understanding GCLogs
        • Stop-the-world GC
        • The jstat tool
        • The jmap tool
      • The write surveillance mode
    • Tuning memtables
      • memtable_flush_writers
    • Compaction tuning
      • SizeTieredCompactionStrategy
      • LeveledCompactionStrategy
    • Compression
      • NodeTool
      • compactionstats
      • netstats
      • tpstats
      • Cassandra's caches
        • Filesystem caches
      • Separate drive for commit logs
      • Tuning the kernel for Cassandra
      • noop scheduler
      • NUMA
      • Other tuning parameters
      • Dynamic snitch
      • Configuring a Cassandra multiregion cluster
    • Summary
  • Chapter 6: Analytics
    • Hadoop integration
      • Configuring Hadoop with Cassandra
        • Virtual datacenter
      • Acunu Analytics
      • Reading data directly from Cassandra
      • Analytics on backups
        • File streaming
    • Summary
  • Chapter 7: Security and Troubleshooting
    • Encryption
      • Creating a keystore
      • Creating a truststore
      • Transparent data encryption
        • Keyspace authentication (simple authenticator)
        • JMX authentication
    • Audit
    • Things to look out for
    • Summary

Vijay Parthasarathy

Vijay Parthasarathy is an Apache Cassandra committer who has helped multiple companies use Cassandra successfully; most notably, he was instrumental in Netflix's move to Cassandra. Vijay has multiple years of experience in software engineering and in managing large project teams. He has also architected, designed, and developed multiple large-scale distributed computing systems, distributed databases, and highly concurrent systems.

What you will learn from this book

  • Explore trade-offs and basic concepts
  • Install Cassandra, choose hardware, and configure the cluster
  • Query and insert data using CQL
  • Get to grips with performance tuning
  • Find out about Hadoop integration and evolving apps
  • Discover anti-patterns and how to secure your cluster

In Detail

Apache Cassandra is a massively scalable open source NoSQL database. Cassandra is well suited to managing large amounts of structured, semi-structured, and unstructured data across multiple data centers and the cloud. Cassandra delivers linear scalability and performance across many commodity servers with no single point of failure.

This book starts with the basic concepts and the CAP theorem, explaining how Cassandra's design follows from them. You will learn how to install and configure a Cassandra cluster as well as tune the cluster for performance. After reading the book, you should understand why the system works the way it does, and you will be able to recognize patterns (and use cases) as well as anti-patterns that could cause performance degradation. The book also explains how to configure Hadoop, vnodes, and multi-DC clusters, how to enable tracing and various security features, and how to query data from Cassandra.

Starting with the trade-offs, the book gradually moves on to setting up and configuring high-performance clusters. It helps administrators understand the system better by explaining the various components in Cassandra's architecture, and hence be more productive in operating the cluster. It discusses use cases and problems, anti-patterns, and potential practical solutions rather than raw techniques. You will learn about kernel and JVM tuning parameters that can be adjusted to get the most out of system resources.

This book is a practical, hands-on guide, taking the reader from the basics of using Cassandra through to installing and running it.

Who this book is for

Learning Cassandra for Administrators is for administrators who manage large deployments of Cassandra clusters, and for support engineers who would like to install the monitoring tools and who are responsible for keeping the cluster stable so that the service is always up and running.
