You're reading from  Fast Data Processing Systems with SMACK Stack

Product type: Book
Published in: Dec 2016
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786467201
Edition: 1st Edition
Author (1)
Raúl Estrada

Raúl Estrada has been a programmer since 1996 and a Java developer since 2001. He loves all topics related to computer science. With more than 15 years of experience in high-availability and enterprise software, he has been designing and implementing architectures since 2003. His specialization is in systems integration, and he mainly participates in projects related to the financial sector. He has been an enterprise architect for BEA Systems and Oracle Inc., but he also enjoys web, mobile, and game programming. Raúl is a supporter of free software and enjoys experimenting with new technologies, frameworks, languages, and methods. Raúl is the author of other Packt Publishing titles, such as Fast Data Processing Systems with SMACK and Apache Kafka Cookbook.

Chapter 2.  The Model - Scala and Akka

This chapter is divided into two parts: Scala (the language) and Akka (the actor model implementation for the JVM).

This book is about architecture, and Spark is built in Scala following an actor model, so we decided to show all examples in Scala only. This leaves more room for architectural issues and avoids filler content.

In the Apache Spark world, there are four supported languages: Java, Scala, Python, and R. To continue with our training, we need to know one of these four languages. Most books present every example in each of these languages.

If you are reading this section and do not know Scala, welcome to the introductory course on data manipulation. This chapter is a dojo where you will learn some Scala tricks for manipulating data (because this is not a book about Scala, some powerful topics are not covered, such as null-less containers like Option, Either, and Try, pattern matching, and case classes). It...

The language - Scala


The objective of this section is to think in a functional programming way.

As good data architects, we will focus here on understanding collections. We will not cover any part of the language other than collection management.

We need to be clear regarding the following two statements:

  • Scala collections are different from Java collections
  • Scala collections are different from Spark collections

So, a list in Java is different from a list in Scala. Lists are a fundamental part of functional languages. The name of the first functional programming language, LISP, stands for List Processing.
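To make the first statement concrete, here is a minimal sketch that uses only the standard libraries of both languages. The object name `ListsCompared` and the sample values are illustrative assumptions, not taken from the book. It shows one visible difference: a Scala `List` is immutable, so prepending produces a new list, while a `java.util.ArrayList` is changed in place.

```scala
import java.util.{ArrayList => JArrayList}

object ListsCompared {
  // Scala's List is immutable: prepending with :: returns a new list
  // and leaves the original untouched.
  val scalaList: List[Int] = List(1, 2, 3)
  val longer: List[Int] = 0 :: scalaList

  // java.util.ArrayList is mutable: add() modifies the same instance.
  val javaList = new JArrayList[Int]()
  javaList.add(1)
  javaList.add(2)

  def main(args: Array[String]): Unit = {
    println(scalaList) // List(1, 2, 3)
    println(longer)    // List(0, 1, 2, 3)
    println(javaList)  // [1, 2]
  }
}
```

Because `scalaList` never changes, it can be shared freely between functions, which is one reason immutable lists fit the functional style this section aims for.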

We have to master three key concepts of functional programming to understand Scala collections:

  • Predicates
  • Literal functions (anonymous functions)
  • Implicit loops

A predicate is just a function that receives one or more parameters and returns a Boolean value.

For example:

def isOdd(i: Int): Boolean = i % 2 != 0
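The three concepts listed above work together in practice. The following is a small sketch, assuming a hypothetical object `PredicateDemo` and a sample list chosen for illustration: a named predicate is passed to `filter`, the same kind of test is written as an anonymous function, and in both cases `filter` performs the loop implicitly.

```scala
object PredicateDemo {
  // A named predicate: takes an Int and returns a Boolean.
  def isOdd(i: Int): Boolean = i % 2 != 0

  val numbers: List[Int] = List(1, 2, 3, 4, 5, 6)

  // Implicit loop: filter walks the list for us, so we write
  // no index variable and no explicit for/while loop.
  val odds: List[Int] = numbers.filter(isOdd)

  // The predicate written inline as an anonymous function.
  val evens: List[Int] = numbers.filter(i => i % 2 == 0)

  // The same idea with Scala's underscore placeholder syntax.
  val greaterThanFour: List[Int] = numbers.filter(_ > 4)

  def main(args: Array[String]): Unit = {
    println(odds)            // List(1, 3, 5)
    println(evens)           // List(2, 4, 6)
    println(greaterThanFour) // List(5, 6)
  }
}
```

In each case the original `numbers` list is left unchanged, and `filter` returns a new list containing only the elements for which the predicate is true.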

A literal function is an alternate syntax for defining a function. It's useful when we want...

The model - Akka


The objective of this section is to think about our systems in the Actor Model.

The Actor Model is a mathematical model. As Obi-Wan would say, it's "An elegant weapon for a more civilized age." The Actor Model was developed by Carl Hewitt, Peter Bishop, and Richard Steiger in 1973 at the Massachusetts Institute of Technology, in a paper entitled A Universal Modular Actor Formalism for Artificial Intelligence.

It was a more civilized age because computer science was developed by mathematicians and all programming was done with bare hands. Well, if the Actor Model has been around for more than 40 years, at what point did we turn to the dark side? The answer is neither short nor simple to explain.

The quick and dirty answer is that the model was too advanced for the technology of its day. A lot of software and hardware had to be developed before we could reap the benefits of the Actor Model. Modern compilers, modern processors, and...

Summary


This chapter was a Scala-Akka dojo where you learned through several katas. In the first part, we explored the fundamental parts of Scala; in the second part, we focused on the Akka actor model.

Admittedly, many important topics were not covered in this chapter, such as futures, promises, and parallel collections. We tried to provide a reference to them rather than an exhaustive guide.

Since all the examples in this book are in Scala, we need to master these fundamental techniques before delving into the SMACK stack.

Understanding the Actor Model is important for understanding the architecture and operation of Spark.

In the following chapter, we will explore Spark design and provide some examples using Scala.
