You're reading from Fast Data Processing Systems with SMACK Stack

Product type: Book
Published in: Dec 2016
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786467201
Edition: 1st Edition
Languages: Scala
Tools: Mesos, Apache Spark
Concepts: Data Processing
Author: Raúl Estrada

Chapter 7. Study Case 1 - Spark and Cassandra

The last three chapters are case studies. In the first, we discuss the relationship between Spark and Cassandra; in the second, we explore the relationships among the other technologies; and in the last, we analyze the Mesos frameworks and containers.

Remember that all the examples use Scala as the language and Akka as the actor model implementation. Mesos is treated as an infrastructure technology, so we assume that every use case can be deployed on Mesos and written with Scala and Akka.

This chapter has the following parts:

Spark Cassandra connector
Study case: The Calliope project

Spark Cassandra connector

To use Apache Spark and Apache Cassandra together, we could write the integration code by hand, but thanks to the open source community coordinated by DataStax, we have the Spark Cassandra connector. If you remember the history, Cassandra was conceived at Facebook, became an Apache project, and grew so large that a whole company was created to support it: DataStax.

DataStax is the company that steers much of Apache Cassandra's development. Among other useful tools, DataStax has developed the Spark-Cassandra connector, a powerful open source library with three main goals:

Expose Cassandra tables as Spark RDDs.
Write Spark RDDs to Cassandra.
Execute CQL queries within Spark applications.
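The three goals above can be sketched in a few lines of Scala. This is a minimal illustration rather than the book's own listing: it assumes the spark-cassandra-connector library is on the classpath and that a Cassandra node is reachable at 127.0.0.1. The `demo` keyspace, the `words` table, and the `WordCount` case class are hypothetical names chosen for the example.

```scala
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.{SparkConf, SparkContext}

object ConnectorSketch {
  // Hypothetical row type; its field names match the table's column names.
  case class WordCount(word: String, count: Int)

  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark master and a Cassandra node on 127.0.0.1.
    val conf = new SparkConf()
      .setAppName("connector-sketch")
      .setMaster("local[2]")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // Execute CQL from the Spark application: create the schema if needed.
    CassandraConnector(conf).withSessionDo { session =>
      session.execute(
        "CREATE KEYSPACE IF NOT EXISTS demo WITH replication = " +
          "{'class': 'SimpleStrategy', 'replication_factor': 1}")
      session.execute(
        "CREATE TABLE IF NOT EXISTS demo.words (word text PRIMARY KEY, count int)")
    }

    // Write a Spark RDD to Cassandra.
    sc.parallelize(Seq(WordCount("spark", 10), WordCount("cassandra", 20)))
      .saveToCassandra("demo", "words", SomeColumns("word", "count"))

    // Expose the Cassandra table as a Spark RDD and print its rows.
    val words = sc.cassandraTable[WordCount]("demo", "words")
    words.collect().foreach(println)

    sc.stop()
  }
}
```

Mapping rows to a case class keeps the read and write sides symmetric; the connector can also return generic `CassandraRow` objects when no row type is given.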

The main features of the Spark-Cassandra connector are:

Supports Apache Spark version 1.0 through 1.6
Supports Apache Cassandra version 2.0 or later
Supports Scala versions 2.10 and 2.11
Supports all the Cassandra data types including collections
Can convert...

Study case: The Calliope project

In Greek mythology, Calliope (/kəˈlaɪ.əpiː/ kə-ly-ə-pee; Ancient Greek: Καλλιόπη Kalliopē "beautiful-voiced") was the muse of epic poetry. Calliope was the daughter of Zeus and Mnemosyne, and she is believed to have been the muse who inspired the poet Homer to write the Odyssey and the Iliad.

Calliope is a bridge between Cassandra and Spark that allows us to create fast real-time data apps with ease. It is a library that provides an interface to load Cassandra data into Spark and to store Spark Resilient Distributed Datasets (RDDs) in Cassandra. As we saw, we can use Spark with Cassandra without Calliope, but Calliope makes it all easier.

Calliope was started by Tuplejump Inc in 2013, when no other solution was available for working with Cassandra data in Spark. In 2014, Tuplejump worked on stabilizing the core while Calliope was adopted and deployed at many organizations.

Installing Calliope

To use the Calliope jar from the Spark shell, add this jar to...

Summary

In this chapter's case study, we covered the connection between Spark and Cassandra.

We looked at the Spark Cassandra connector: setting up the Cassandra and Spark contexts, using Cassandra with Spark Streaming, creating a streaming context, reading a stream from and writing a stream to Cassandra, saving datasets, collections, and tuples to Cassandra, modifying collections, and saving UDTs and RDDs as tables.

We also reviewed the Calliope project: installing Calliope, and reading from and writing to Cassandra with both CQL3 and Thrift.

In the next chapter, we will examine the relationships among the remaining SMACK technologies.

Author (1)

Raúl Estrada

Raúl Estrada has been a programmer since 1996 and a Java developer since 2001. He loves all topics related to computer science. With more than 15 years of experience in high-availability and enterprise software, he has been designing and implementing architectures since 2003. His specialization is in systems integration, and he mainly participates in projects related to the financial sector. He has been an enterprise architect for BEA Systems and Oracle Inc., but he also enjoys web, mobile, and game programming. Raúl is a supporter of free software and enjoys experimenting with new technologies, frameworks, languages, and methods. Raúl is the author of other Packt Publishing titles, such as Fast Data Processing Systems with SMACK and Apache Kafka Cookbook.