You're reading from Seven NoSQL Databases in a Week

Product type: Book

Published in: Mar 2018

Publisher: Packt

ISBN-13: 9781787288867

Edition: 1st Edition

Tools

MongoDB Cassandra

Concepts

Database Programming

Authors (2):

Sudarshan Kadambi

Xun (Brian) Wu

Row versus column versus column-family storage models

A logical table of rows and columns can be stored physically on disk in several ways.

You can store the contents of entire rows together, so that all of the columns of a given row sit next to each other on disk. This works well when the access pattern reads many of the columns for a given set of rows. MySQL uses such a row-oriented storage model.
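As a rough illustration of the row-oriented layout, the following Python sketch packs each row into one contiguous fixed-width record. The schema (id, age, balance), the sample values, and the helper names are assumptions made for this example; this is not how MySQL actually formats its pages.

```python
import struct

# Hypothetical fixed-width schema: id (int32), age (int32), balance (float64).
# "<" selects little-endian with no padding, so each record is 4 + 4 + 8 bytes.
ROW_FORMAT = "<iid"
ROW_SIZE = struct.calcsize(ROW_FORMAT)

rows = [(1, 34, 120.5), (2, 27, 80.0), (3, 45, 300.25)]


def pack_rows(rows):
    """Serialize rows back to back: all columns of a row are adjacent."""
    return b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)


def read_row(buf, index):
    """Fetch every column of one row with a single offset computation."""
    return struct.unpack_from(ROW_FORMAT, buf, index * ROW_SIZE)


row_store = pack_rows(rows)
print(read_row(row_store, 1))
```

Reading a whole row touches one contiguous byte range, which is why this layout suits workloads that need most columns of a few rows.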

On the other hand, you could store the contents of entire columns together. In this scheme, the values from all of the rows for a given column are stored together. This is well suited to analytic use cases, where you might need to scan the entire table but only for a small set of columns. Storing data as column vectors allows for better compression, since there is less entropy among the values within a single column than there is among the values of different columns within a row. These column vectors can also be retrieved from disk and processed quickly in a vectorized fashion using the SIMD capabilities of modern processors. SIMD processing on column vectors can approach throughputs of a billion data points per second on a personal laptop.
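The column-oriented layout and its compression benefit can be sketched in a few lines of Python. The table contents, the column names, and the choice of run-length encoding are assumptions for illustration only; real columnar engines choose among several encodings.

```python
from itertools import groupby

# Hypothetical table: (id, country, amount).
COLUMNS = ("id", "country", "amount")
rows = [
    (1, "DE", 10.0),
    (2, "DE", 12.5),
    (3, "DE", 7.5),
    (4, "US", 20.0),
    (5, "US", 5.0),
]


def to_columns(rows):
    """Pivot rows into one list (column vector) per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


column_store = to_columns(rows)

# An analytic scan reads only the one column it needs.
total_amount = sum(column_store["amount"])

# Values within a column are similar, so a simple encoding shrinks them.
encoded_country = run_length_encode(column_store["country"])

print(total_amount)
print(encoded_country)
```

The aggregate never touches the id or country vectors, and the five country values collapse into two runs. Interleaved row data would not offer either property.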

Hybrid schemes are possible as well. Rather than storing an entire column vector together, you can first break up the rows of a table into distinct row groups and then, within each row group, store the column vectors together. Parquet and ORC use such a data placement strategy.
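A minimal Python sketch of this hybrid placement follows. It only models the idea of row groups that each hold their own column vectors; the function names, the sample table, and the group size are assumptions, and nothing here reproduces the actual Parquet or ORC file formats.

```python
COLUMNS = ("id", "country", "amount")
rows = [
    (1, "DE", 10.0),
    (2, "DE", 12.5),
    (3, "DE", 7.5),
    (4, "US", 20.0),
    (5, "US", 5.0),
]


def to_row_groups(rows, columns, group_size):
    """Split rows into consecutive groups, then store each group column-wise."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        groups.append(
            {name: [row[i] for row in chunk] for i, name in enumerate(columns)}
        )
    return groups


def scan_column(groups, name):
    """Read one column across all row groups, skipping the other columns."""
    for group in groups:
        yield from group[name]


row_groups = to_row_groups(rows, COLUMNS, group_size=2)
print(len(row_groups))
print(list(scan_column(row_groups, "amount")))
```

Each group keeps the columns of a bounded set of rows close together, so a reader can process one group at a time while still scanning only the columns it needs.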

In another variant, data is stored row-wise, but the rows are divided into row groups, and each row group is assigned to a shard. Within a row group, sets of columns that are often queried together, called column families, are stored physically together on disk. This storage model is used by HBase and is discussed in more detail in Chapter 6, HBase.
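The column-family model can be approximated with the Python sketch below. The family names, column names, split key, and helper functions are all hypothetical, and the nested dictionaries are a simplification: this does not reflect HBase's on-disk file format or its client API.

```python
from bisect import bisect_right

# Hypothetical column families: columns that tend to be queried together.
FAMILIES = {
    "profile": ("name", "email"),
    "activity": ("last_login", "clicks"),
}

# Hypothetical split point: shard 0 holds keys < "m", shard 1 holds keys >= "m".
SPLIT_KEYS = ["m"]


def shard_for(row_key):
    """Assign a row to a shard by the key range it falls into."""
    return bisect_right(SPLIT_KEYS, row_key)


def store(table):
    """Place each row in its shard, split into one store per column family."""
    shards = {}
    for row_key, values in table.items():
        shard = shards.setdefault(
            shard_for(row_key), {family: {} for family in FAMILIES}
        )
        for family, columns in FAMILIES.items():
            shard[family][row_key] = {
                col: values[col] for col in columns if col in values
            }
    return shards


table = {
    "alice": {"name": "Alice", "email": "a@example.com", "clicks": 7},
    "zoe": {"name": "Zoe", "last_login": "2018-03-01", "clicks": 2},
}

shards = store(table)
print(shards[0]["profile"]["alice"])
print(shards[1]["activity"]["zoe"])
```

A query that needs only the profile columns of a row reads one family's store within one shard and leaves the activity data untouched.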

Authors (2)

Sudarshan Kadambi

Sudarshan has a background in distributed systems and database design. He has been a user of and contributor to various NoSQL databases and is passionate about solving large-scale data management challenges.

Xun (Brian) Wu

Xun (Brian) Wu is a senior blockchain architect and consultant. With over 20 years of hands-on experience across various technologies, including Blockchain, big data, cloud, AI, systems, and infrastructure, Brian has worked on more than 50 projects in his career. He has authored nine books, which have been published by O'Reilly, Packt, and Apress, focusing on popular fields within the Blockchain industry. The titles of his books include: Learn Ethereum (First Edition), Learn Ethereum (Second Edition), Blockchain for Teens, Hands-On Smart Contract Development with Hyperledger Fabric V2, Hyperledger Cookbook, Blockchain Quick Start Guide, Security Tokens and Stablecoins Quick Start Guide, Blockchain by Example, and Seven NoSQL Databases in a Week.
