Fast Data Processing Systems with SMACK Stack

Fast Data Processing Systems with SMACK Stack: Combine the incredible powers of Spark, Mesos, Akka, Cassandra, and Kafka to build data processing platforms that can take on even the hardest of your data troubles!

By Raúl Estrada
Book Dec 2016 376 pages 1st Edition


Product Details


Publication date : Dec 22, 2016
Length : 376 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781786467201

Fast Data Processing Systems with SMACK Stack

Chapter 1.  An Introduction to SMACK

The goal of this chapter is to present the data problems and scenarios that the SMACK architecture solves. It explains how each technology contributes to the stack and how this modern pipeline architecture addresses many of the problems found in today's data-processing environments. We will see when to use SMACK and when it is not suitable, and we will also touch on the new professional profiles created in this new data management era.

In this chapter we will cover the following topics:

  • Modern data-processing challenges
  • The data-processing pipeline architecture
  • SMACK technologies
  • Changing the data center operations
  • Data expert profiles
  • Is SMACK for me?

Modern data-processing challenges


We can enumerate four modern data-processing problems as follows:

  • Size matters: In modern times, data is getting bigger or, more accurately, the number of available data sources is increasing. In the previous decade, we could precisely identify our company's internal data sources: Customer Relationship Management (CRM), Point of Sale (POS), Enterprise Resource Planning (ERP), Supply Chain Management (SCM), and all our databases and legacy systems. Put simply, a system that is not internal is external. Today it is exactly the same, except that not only do the data sources multiply over time, the amount of information flowing from external systems is also growing at an ever-increasing rate. New data sources include social networks, banking systems, stock systems, tracking and geolocation systems, monitoring systems, sensors, and the Internet of Things; if a company's architecture is incapable of handling these use cases, it cannot respond to upcoming challenges.
  • Sample data: Obtaining a sample of production data is becoming more difficult. In the past, data analysts could have a fresh copy of production data on their desks almost daily. Today, this is increasingly difficult, either because of the amount of data to be moved or because of its short shelf life; in many modern business models, data from an hour ago is practically obsolete.
  • Data validity: Analyses become obsolete faster. Assuming that the fresh-copy problem is solved, how often is new data needed? Looking for a trend over the last year is different from looking for one over the last few hours. If samples from a year ago are needed, what is the frequency of these samples? Many modern businesses don't even have this information or, worse, they have it but it is only stored.
  • Data Return on Investment (ROI): Data analysis can become too slow to generate any return on investment from the information. Now, suppose you have solved the problems of sample data and data validity. The challenge is to analyze the information in a timely manner so that the return on all our efforts is profitable. Many companies invest in data but never turn the analysis into increased income.

We can enumerate the modern data needs as follows:

  • Scalable infrastructure: Companies constantly have to weigh the time and money they spend. Scalability in a data center means that the data center should grow in proportion to business growth. Vertical scalability means adding more resources (processing power, memory, storage) to existing machines. Horizontal scalability means that when a layer faces more demand and requires more infrastructure, machines can be added so that processing needs are met. One modern requirement is horizontal scaling with low-cost hardware.
  • Geographically dispersed data centers: Geographically centralized data centers are being displaced. This is because companies need to have multiple data centers in multiple locations for several reasons: cost, ease of administration, or access to users. This implies a huge challenge for data center management. On the other hand, data center unification is a complex task.
  • Allow data volumes to scale as the business needs: The volume of data must scale dynamically according to business demand. Just as demand can spike at certain times of day, it can also spike in certain geographic regions. Scaling should be possible dynamically in both time and space, especially horizontally.
  • Faster processing: Today, being able to work in real time is fundamental. We live in an age where data freshness often matters more than the amount or size of data. If data is not processed fast enough, it becomes stale quickly. Fresh information not only needs to be obtained quickly, it also has to be processed quickly.
  • Complex processing: In the past, data was smaller and simpler. Raw data doesn't help us much on its own; the information must be processed efficiently through several layers. The first layers are usually purely technical and the last layers mainly business-oriented. Processing complexity can kill off even the best business ideas.
  • Constant data flow: For cost reasons, the number of data warehouses is decreasing. The era when data warehouses served merely to store data is dying; today, no one can afford a data warehouse that only stores information, and such warehouses are becoming very expensive and meaningless. The trend is towards flows or streams of data: data no longer stagnates, it moves like a large river. Performing data analysis on these big information torrents is one of the objectives of modern businesses.
  • Visible, reproducible analysis: If we cannot reproduce phenomena, we cannot call ourselves scientists. Modern data science requires producing reports and graphs in real time so that timely decisions can be taken. The aim of data science is to make effective predictions based on observation; the process should be visible and reproducible.

The data-processing pipeline architecture


If you ask several people from the information technology world, they will agree on few things, except that we are always looking for a new acronym, and the year 2015 was no exception.

As the book title says, SMACK stands for Spark, Mesos, Akka, Cassandra, and Kafka. All of these technologies are open source and, with the exception of Akka, all are Apache Software Foundation projects. The acronym was coined by Mesosphere, a company that bundles these technologies together in a product called Infinity, designed in collaboration with Cisco to solve pipeline data challenges where the speed of response is fundamental, such as fraud detection engines.

SMACK exists because one technology doesn't make an architecture. SMACK is a pipelined architecture model for data processing. A data pipeline is software that consolidates data from multiple sources and makes it available to be used strategically.

It is called a pipeline because each technology contributes with its characteristics to a processing line similar to a traditional industrial assembly line. In this context, our canonical reference architecture has four parts: storage, the message broker, the engine, and the hardware abstraction.

For example, Apache Cassandra on its own solves some of the problems that a modern database can solve but, given its characteristics, it takes the lead in the storage role of our reference architecture.

Similarly, Apache Kafka was designed to be a message broker and by itself solves many problems in specific businesses; however, its integration with the other tools earns it a special place in our reference architecture over its competitors.

The NoETL manifesto

The acronym ETL stands for Extract, Transform, Load. In its Database Data Warehousing Guide, Oracle says:

Designing and maintaining the ETL process is often considered one of the most difficult and resource intensive portions of a data warehouse project.

For more information, refer to http://docs.oracle.com/cd/B19306_01/server.102/b14223/ettover.htm.

Contrary to how many companies treat it in their daily operations, ETL is not a goal; it is a series of unnecessary steps:

  • Each ETL step can introduce errors and risk
  • It can duplicate data after failover
  • Tools can cost millions of dollars
  • It decreases throughput
  • It increases complexity
  • It writes intermediary files
  • It parses and re-parses plain text
  • It duplicates the pattern over all our data centers

NoETL pipelines fit naturally on the SMACK stack: Spark, Mesos, Akka, Cassandra, and Kafka. And if you use SMACK, make sure it is highly available, resilient, and distributed.

A good sign that you are suffering from ETL-itis is writing intermediary files. Files are useful in day-to-day work but, as data types, they are difficult to handle. Some programmers advocate replacing the file system with a better API.

Removing the E in ETL: Instead of text dumps that you need to parse over multiple systems, technologies such as Scala and Parquet, for example, can work with binary data that remains strongly typed; they represent a return to strong typing in the data ecosystem.

Removing the L in ETL: If data collection is backed by a distributed messaging system (Kafka, for example), you can fan out the ingested data to all consumers in real time. There is no need to batch-load.

The T in ETL: In this architecture, each consumer can perform its own transformations.
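
To make the fan-out idea concrete, here is a minimal Scala sketch of a Kafka consumer that applies its own transformation to the ingested stream, using the standard Kafka client API. The broker address, consumer group, and topic name (ingest-events) are hypothetical placeholders.

    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import scala.collection.JavaConverters._

    object FanOutConsumer extends App {
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092")   // hypothetical broker address
      props.put("group.id", "reporting-consumers")       // each consumer group receives its own copy of the stream
      props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

      val consumer = new KafkaConsumer[String, String](props)
      consumer.subscribe(Collections.singletonList("ingest-events")) // hypothetical topic

      while (true) {
        val records = consumer.poll(1000)
        for (record <- records.asScala) {
          // This consumer applies its own transformation instead of relying on a central ETL step
          println(s"${record.key} -> ${record.value.toUpperCase}")
        }
      }
    }

Any number of consumer groups can subscribe to the same topic, each receiving its own copy of the stream, which is what removes the need for a separate batch-load step.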

So, the modern tendency is: no more Greek letter architectures, no more ETL.

Lambda architecture

The academic definition of the Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods. The problem arises when we need to process data streams in real time.

Here, a special mention for two open source projects that allow batch processing and real-time stream processing in the same application: Apache Spark and Apache Flink. There is a battle between these two: Apache Spark is the solution led by Databricks, and Apache Flink is the solution led by data Artisans.

For example, Apache Spark and Apache Cassandra together meet two of the modern requirements described previously:

  • They handle a massive data stream in real time
  • They handle multiple and different data models from multiple data sources

Most Lambda solutions, as mentioned, cannot meet these two needs at the same time. As a demonstration of power, in an architecture based only on these two technologies, Apache Spark is responsible for the real-time analysis of both historical data and recent data obtained from the massive information torrent. All of this information and the analysis results are persisted in Apache Cassandra, so in the case of failure we can recover the real-time data from any point in time; with a Lambda architecture this is not always possible.
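
As a sketch of this Spark-plus-Cassandra pairing, the snippet below reads a table, aggregates it, and writes the result back, assuming the open source spark-cassandra-connector is on the classpath. The keyspace, table, and column names are hypothetical.

    import com.datastax.spark.connector._
    import org.apache.spark.{SparkConf, SparkContext}

    object SparkCassandraSketch extends App {
      val conf = new SparkConf()
        .setAppName("stream-and-history-analytics")
        .setMaster("local[2]")                               // local master only for testing
        .set("spark.cassandra.connection.host", "127.0.0.1") // hypothetical Cassandra node

      val sc = new SparkContext(conf)

      // Read events persisted by the streaming layer (hypothetical keyspace and table)
      val events = sc.cassandraTable("analytics", "events")

      // Aggregate with ordinary RDD combinators
      val perUser = events
        .map(row => (row.getString("user_id"), 1))
        .reduceByKey(_ + _)

      // Persist the results back to Cassandra so they survive failures
      perUser.saveToCassandra("analytics", "events_per_user", SomeColumns("user_id", "total"))

      sc.stop()
    }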

Hadoop

Hadoop was designed to transfer processing closer to the data to minimize the amount of data shuffled across the network. It was designed with data warehouse and batch problems in mind; it fits into the slow data category, where size, scope, and completeness of data are more important than the speed of response.

The analogy is the sea versus the waterfall. In a sea of information you have a huge amount of data, but it is static, contained, and motionless, perfect for batch processing without time pressure. In a waterfall you also have a huge amount of data, but it is dynamic, uncontained, and in motion. In this context your data often has an expiration date; after that time passes, it is useless.

Some Hadoop adopters have been left questioning the true return on investment of their projects after running for a while; this is not a technological fault itself, but a case of whether it is the right application. SMACK has to be analyzed in the same way.

SMACK technologies


SMACK is a full stack for pipeline data architecture: Spark, Mesos, Akka, Cassandra, and Kafka. Further on in the book, we will also talk about the most important factor: the integration of these technologies.

Pipeline data architecture is required for online data stream processing. There are plenty of books about each technology separately; this book is a compendium of how to integrate these technologies in a pipeline data architecture.

We talk about the five main concepts of pipeline data architecture and how to integrate, replace, and reinforce every layer:

  • The engine: Apache Spark
  • The actor model: Akka
  • The storage: Apache Cassandra
  • The message broker: Apache Kafka
  • The hardware scheduler: Apache Mesos:

Figure 1.1 The SMACK pipeline architecture

Apache Spark

Spark is a fast and general engine for data processing on a large scale.

The Spark goals are:

  • Fast data processing
  • Ease of use
  • Supporting multiple languages
  • Supporting sophisticated analytics
  • Real-time stream processing
  • The ability to integrate with existing Hadoop data
  • An active and expanding community

Here is some chronology:

  • 2009: Spark was initially started by Matei Zaharia at UC Berkeley AMPLab
  • 2010: Spark was open-sourced under a BSD license
  • 2013: Spark was donated to the Apache Software Foundation and its license changed to Apache 2.0
  • 2014: Spark became a top-level Apache Project
  • 2014: The engineering team at Databricks used Spark and set a new world record in large-scale sorting

As you are reading this book, you probably know all the Spark advantages already, but here we mention the most important ones:

  • Spark is faster than Hadoop: Spark makes efficient use of memory and is able to execute equivalent jobs 10 to 100 times faster than Hadoop's MapReduce.
  • Spark is easier to use than Hadoop: You can develop in four languages: Scala, Java, Python, and recently R. Spark is implemented in Scala and Akka. When you work with collections in Spark, it feels as if you are working with local Java, Scala, or Python collections (see the sketch after this list). For practical reasons, in this book we only provide examples in Scala.
  • Spark scales differently than Hadoop: In Hadoop, you require experts and specialized hardware to run monolithic software. In Spark, you can easily grow your cluster horizontally by adding nodes of inexpensive, non-specialized hardware, and Spark provides plenty of tools to manage the cluster.
  • Spark has it all in a single framework: coarse-grained transformations, real-time data-processing functions, SQL-like handling of structured data, graph algorithms, and machine learning.
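
As a taste of that collection-like feel, here is a minimal word-count sketch in Scala; the input path is a placeholder and the local master is only for testing.

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount extends App {
      val conf = new SparkConf().setAppName("word-count").setMaster("local[2]") // local master only for testing
      val sc   = new SparkContext(conf)

      // A distributed collection (RDD) manipulated with the same combinators
      // you would use on a local Scala collection
      val counts = sc.textFile("data/events.log")            // hypothetical input path
        .flatMap(_.split("\\s+"))
        .map(word => (word, 1))
        .reduceByKey(_ + _)

      counts.take(10).foreach(println)
      sc.stop()
    }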

It is important to mention that Spark was made with Online Analytical Processing (OLAP) in mind, that is, batch jobs and data mining. Spark was not designed for Online Transaction Processing (OLTP), that is, fast and numerous atomic transactions; for this type of processing, we strongly advise the reader to consider the use of Erlang/Elixir.

Apache Spark has these main components:

  • Spark Core
  • Spark SQL
  • Spark Streaming
  • Spark MLlib
  • Spark GraphX

The reader will find that entire books are normally devoted to each Spark component. In this book, we just cover the essentials of Apache Spark needed for the SMACK stack.

In the SMACK stack, Apache Spark is the data-processing engine; it provides near real-time analysis of data (note the word near, because today processing petabytes of data cannot be done in real time).
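
To illustrate the near real-time point, here is a minimal Spark Streaming sketch that processes data in small micro-batches; the socket source and port are placeholders, and in a SMACK pipeline the input would typically come from Kafka.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingCounts extends App {
      val conf = new SparkConf().setAppName("streaming-counts").setMaster("local[2]")
      val ssc  = new StreamingContext(conf, Seconds(5))      // 5-second micro-batches

      // Placeholder text source; in a SMACK pipeline this would typically be a Kafka stream
      val lines  = ssc.socketTextStream("localhost", 9999)
      val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
      counts.print()                                         // emit the counts of each micro-batch

      ssc.start()
      ssc.awaitTermination()
    }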

Akka

Akka is an actor model implementation for the JVM: a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications.

The open source Akka toolkit was first released in 2009. It simplifies the construction of concurrent and distributed Java applications. Language bindings exist for both Java and Scala.

It is message-based and asynchronous; typically no mutable data is shared. It is primarily designed for actor-based concurrency:

  • Actors are arranged hierarchically
  • Each actor is created and supervised by its parent actor
  • Program failures are treated as events and handled by an actor's supervisor
  • It is fault-tolerant
  • It has hierarchical supervision
  • Customizable failure strategies and detection
  • Asynchronous data passing
  • Parallelized
  • Adaptive and predictive
  • Load-balanced
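
To make these characteristics concrete, here is a minimal sketch using Akka's classic actor API; the actor and system names are illustrative only.

    import akka.actor.{Actor, ActorSystem, Props}

    // A child actor: its failures are handled as events by its parent (supervisor)
    class Greeter extends Actor {
      def receive: Receive = {
        case name: String => println(s"Hello, $name")         // one message at a time, no locks needed
      }
    }

    object AkkaSketch extends App {
      val system  = ActorSystem("smack-demo")
      val greeter = system.actorOf(Props[Greeter], "greeter") // created and supervised by the user guardian

      greeter ! "SMACK"                                       // asynchronous, fire-and-forget message passing
      greeter ! "Akka"

      Thread.sleep(500)                                       // give the actor time to process before shutdown
      system.terminate()
    }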

Apache Cassandra

Apache Cassandra is a database with the scalability, availability, and performance necessary to compete with any database system in its class. We know that there are better database systems; however, Apache Cassandra is chosen because of its performance and its connectors built for Spark and Mesos.

In SMACK, Cassandra is the data layer: Akka, Spark, and Kafka can all store data in it. Cassandra also handles operational data and can serve data back to the application layer.

Cassandra is an open source distributed database that handles large amounts of data; originally developed at Facebook and open-sourced in 2008, it became a top-level Apache project in 2010.
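
As a minimal illustration of Cassandra in the storage role, the sketch below uses the DataStax Java driver (3.x era) from Scala; the contact point, keyspace, and table are hypothetical.

    import com.datastax.driver.core.Cluster
    import scala.collection.JavaConverters._

    object CassandraSketch extends App {
      val cluster = Cluster.builder().addContactPoint("127.0.0.1").build() // hypothetical contact point
      val session = cluster.connect()

      session.execute(
        "CREATE KEYSPACE IF NOT EXISTS analytics " +
        "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}")
      session.execute(
        "CREATE TABLE IF NOT EXISTS analytics.events (id uuid PRIMARY KEY, payload text)")

      session.execute("INSERT INTO analytics.events (id, payload) VALUES (uuid(), 'hello SMACK')")

      for (row <- session.execute("SELECT id, payload FROM analytics.events").asScala)
        println(s"${row.getUUID("id")} -> ${row.getString("payload")}")

      cluster.close()
    }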

Here are some Apache Cassandra features:

  • Extremely fast
  • Extremely scalable
  • Multi-data center support
  • No single point of failure
  • Can survive regional faults
  • Easy to operate
  • Automatic and configurable replication
  • Flexible data modeling
  • Perfect for real-time ingestion
  • Great community

Apache Kafka

Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.

In SMACK, Kafka is the ingestion point for data, typically at the application layer: it takes data from one or more applications and streams it to the next points in the stack.

Kafka is a high-throughput distributed messaging system that handles massive data loads and absorbs floods of incoming data without the need for a separate back-pressure mechanism. Incoming data volumes are distributed and partitioned across the nodes in the cluster.
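
Here is a minimal producer-side sketch of that ingestion role, using the standard Kafka client API; the broker address and topic name are hypothetical.

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object IngestProducer extends App {
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092")   // hypothetical broker address
      props.put("key.serializer",   "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

      val producer = new KafkaProducer[String, String](props)

      // Application-layer events are appended to a topic (name is hypothetical);
      // the record key determines the partition, spreading load across the cluster
      for (i <- 1 to 100)
        producer.send(new ProducerRecord[String, String]("ingest-events", s"user-$i", s"event-$i"))

      producer.close()
    }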

Some Apache Kafka features:

  • High-performance distributed messaging
  • Decouples data pipelines
  • Massive data load handling
  • Supports a massive number of consumers
  • Distribution and partitioning between cluster nodes
  • Broker automatic failover

Apache Mesos

Mesos is a distributed systems kernel. Mesos abstracts all the computer resources (CPU, memory, storage) away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be built easily and run effectively.

Mesos was built using the same principles as the Linux kernel and was first presented in 2009 (under the name Nexus). Later, in 2011, it was presented by Matei Zaharia.

Mesos is the foundation of several frameworks; the main three are:

  • Apache Aurora
  • Chronos
  • Marathon

In SMACK, Mesos' task is to orchestrate the components and manage resources used.
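
As a small sketch of how a framework hands its work to Mesos, this is how a Spark job can be pointed at a Mesos master instead of a standalone cluster; the ZooKeeper addresses and the executor URI are hypothetical.

    import org.apache.spark.{SparkConf, SparkContext}

    object SmackOnMesos extends App {
      // Hypothetical Mesos master (via ZooKeeper) and Spark distribution URI;
      // spark.executor.uri tells Mesos agents where to fetch Spark to launch executors
      val conf = new SparkConf()
        .setAppName("smack-on-mesos")
        .setMaster("mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos")
        .set("spark.executor.uri", "http://repo.example.com/spark-2.0.2-bin-hadoop2.7.tgz")

      val sc = new SparkContext(conf)
      println(s"Connected to cluster manager: ${sc.master}")
      sc.stop()
    }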

Changing the data center operations


This is the point where data processing changes data center operations.

From scale-up to scale-out

Across businesses, we are moving from specialized, proprietary, and typically expensive supercomputers to clusters of commodity machines connected by a low-cost network.

The Total Cost of Ownership (TCO) determines the fate, quality, and size of a data center. If the business is small, the data center should be small; as business demand changes, the data center grows or shrinks accordingly.

Currently, one common practice is to create a dedicated cluster for each technology: a Spark cluster, a Kafka cluster, a Storm cluster, a Cassandra cluster, and so on, which causes the overall TCO to increase.

The open-source predominance

Modern organizations adopt open source to avoid two old and annoying dependencies: vendor lock-in and external entity bug fixing.

In the past, the rules were dictated from the classically large high-tech enterprises or monopolies. Today, the rules come from the people, for the people; transparency is ensured through community-defined APIs and various bodies, such as the Apache Software Foundation or the Eclipse Foundation, which provide guidelines, infrastructure, and tooling for the sustainable and fair advancement of these technologies.

There is no such thing as a free lunch. In the past, larger enterprises used to hire big companies in order to be able to blame and sue someone in the case of failure. Modern industries should take the risk and invest in training their people in open technologies.

Data store diversification

The dominant and omnipotent era of the relational database is being challenged by the proliferation of NoSQL data stores.

You have to deal with the consequences: determining the system of record, synchronizing different stores, and selecting the correct data store.

Data gravity and data locality

Data gravity refers to the overall cost associated with moving data, in terms of volume and tooling; think, for example, of restoring hundreds of terabytes in a disaster recovery scenario.

Data locality is the idea of bringing the computation to the data rather than the data to the computation. As a rule of thumb, the more services you can run on the same nodes as the data, the better prepared you are.

DevOps rules

DevOps refers to the best practices for collaboration between the software development and operational sides of a company.

The development team should have the same environment for local testing as is used in production. For example, Spark allows you to go from local testing to cluster submission without changing your code.

The tendency is to containerize the entire production pipeline.

Data expert profiles


Well, first we will classify people into four groups based on skill: data architect, data analyst, data engineer, and data scientist.

Usually, data skills are separated into two broad categories:

  1. Engineering skills: All the DevOps (yes, DevOps is the new black): setting up servers and clusters, operating systems, writing/optimizing/distributing queries, network protocol knowledge, programming, and everything related to computer science.
  2. Analytical skills: All mathematical knowledge: statistics, multivariable analysis, matrix algebra, data mining, machine learning, and so on.

Data analysts and data scientists have different skills but usually have the same mission in the enterprise.

Data engineers and data architects have the same skills but usually different work profiles.

Data architects

Large enterprises collect and generate a lot of data from different sources:

  1. Internal sources: Owned systems, for example, CRM, HRM, application servers, web server logs, databases, and so on.
  2. External sources: For example, social network platforms (WhatsApp, Twitter, Facebook, Instagram), stock market feeds, GPS clients, and so on.

Data architects:

  • Understand all these data sources and develop a plan for collecting, integrating, centralizing, and maintaining all the data
  • Know the relationship between data and current operations, and understand the effects that any process change has on the data used in the organization
  • Have an end-to-end vision of the processes, see how a logical design maps to a physical design, and understand how the data flows through every stage in the organization
  • Design data models (for example, relational databases)
  • Develop strategies for all data lifecycles: acquisition, storage, recovery, cleaning, and so on

Data engineers

A data engineer is a hardcore engineer who knows the internals of the data engines (for example, database software).

Data engineers:

  • Can install all the infrastructure (database systems, file systems)
  • Write complex queries (SQL and NoSQL)
  • Scale horizontally to multiple machines and clusters
  • Ensure backups and design and execute disaster recovery plans
  • Usually have low-level expertise in different data engines and database software

Data analysts

Their primary tasks are the compilation and analysis of numerical information.

Data analysts:

  • Have computer science and business knowledge
  • Have analytical insights into all the organization's data
  • Know which information makes sense to the enterprise
  • Translate all this into decent reports so the non-technical people can understand and make decisions
  • Do not usually work with statistics
  • Are present (in specialized roles) in mid-sized organizations, for example, sales analysts, marketing analysts, quality analysts, and so on
  • Can figure out new strategies and report to the decision makers

Data scientists

This is a modern phenomenon and is usually associated with modern data. Their mission is the same as that of a data analyst but, when the frequency, velocity, or volume of data crosses a certain level, this position has specific and sophisticated skills to get those insights out.

Data scientists:

  • Have overlapping skills, including but not limited to: Database system engineering (DB engines, SQL, NoSQL), big data systems handling (Hadoop, Spark), computer language knowledge (R, Python, Scala), mathematics (statistics, multivariable analysis, matrix algebra), data mining, machine learning, and so on
  • Explore and examine data from multiple heterogeneous data sources (unlike data analysts)
  • Can sift through all incoming data to discover a previously hidden insight
  • Can make inductions, deductions, and abductions from data to solve a business problem or find a business pattern (data analysts usually just make inductions from data)
  • The best don't just address known business problems, they find patterns to solve new problems and add value to the organization

As you can deduce, this book is mainly focused on the data architect and data engineer profiles. But if you're an enthusiastic data scientist looking for more wisdom, we hope to be useful to you, too.

Is SMACK for me?


Some large companies are using a variation of SMACK in production, particularly those looking at how to take their pipeline data projects forward.

Apache Spark is beginning to attract more large software vendors to support it as it fulfils different needs than Hadoop.

SMACK is becoming a new modern requirement for companies as they move from the initial pilot phases into relying on pipeline data for their revenues.

The point of this book is to give you alternatives.

One example involves replacing individual components. YARN could be used as the cluster scheduler instead of Mesos, while Apache Flink would be a suitable batch and stream processing alternative to Akka. There are many alternatives to SMACK.

The fundamental premise of SMACK is to build an end-to-end data-processing pipeline whose components interact in a way that makes integration simple and getting tasks up and running quick, rather than requiring huge amounts of effort to get the tools to play nicely with each other.

Summary


This chapter was full of theory. We reviewed the fundamental SMACK architecture. We also reviewed the differences between Spark and traditional big data technologies such as Hadoop and MapReduce.

We also reviewed every technology in SMACK and briefly exposed each tool's potential. Each technology is addressed in depth in the following chapters, where we will explore connectors and integration practices, as well as describe the technology alternatives in every case.


Key benefits

  • This highly practical guide shows you how to use the best of the big data technologies to solve your response-critical problems
  • Learn the art of making cheap-yet-effective big data architecture without using complex Greek-letter architectures
  • Use this easy-to-follow guide to build fast data processing systems for your organization

Description

SMACK is an open source full stack for big data architecture. It is a combination of Spark, Mesos, Akka, Cassandra, and Kafka. This stack is the newest technique developers have begun to use to tackle critical real-time analytics for big data. This highly practical guide will teach you how to integrate these technologies to create a highly efficient data analysis system for fast data processing. We’ll start off with an introduction to SMACK and show you when to use it. First you’ll get to grips with functional thinking and problem solving using Scala. Next you’ll come to understand the Akka architecture. Then you’ll get to know how to improve the data structure architecture and optimize resources using Apache Spark. Moving forward, you’ll learn how to perform linear scalability in databases with Apache Cassandra. You’ll grasp the high throughput distributed messaging systems using Apache Kafka. We’ll show you how to build a cheap but effective cluster infrastructure with Apache Mesos. Finally, you will deep dive into the different aspect of SMACK using a few case studies. By the end of the book, you will be able to integrate all the components of the SMACK stack and use them together to achieve highly effective and fast data processing.

What you will learn

  • Design and implement a fast data pipeline architecture
  • Think about and solve programming challenges in a functional way with Scala
  • Learn to use Akka, the actor model implementation for the JVM
  • Perform in-memory processing and data analysis with Spark to solve modern business demands
  • Build a powerful and effective cluster infrastructure with Mesos and Docker
  • Manage and consume unstructured and NoSQL data sources with Cassandra
  • Consume and produce messages in a massive way with Kafka



Table of Contents

15 Chapters
  • Fast Data Processing Systems with SMACK Stack
  • Credits
  • About the Author
  • About the Reviewers
  • www.PacktPub.com
  • Preface
  • An Introduction to SMACK
  • The Model - Scala and Akka
  • The Engine - Apache Spark
  • The Storage - Apache Cassandra
  • The Broker - Apache Kafka
  • The Manager - Apache Mesos
  • Study Case 1 - Spark and Cassandra
  • Study Case 2 - Connectors
  • Study Case 3 - Mesos and Docker
