In the previous chapter, we talked about how to use Mahout and R to solve machine learning problems. In this chapter, we are going to talk about Apache Spark, the latest sensation in the Big Data industry. By now, its power is widely acknowledged. Spark is a fast, general-purpose engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python, and R, and it can perform batch processing as well as stream processing. In this chapter, we are going to explore several important topics related to Apache Spark: batch processing, Spark SQL, stream processing, machine learning with MLlib, and graph processing using Spark's GraphX library. So, let's get started.