You're reading from  MySQL 8 for Big Data

Product type: Book
Published in: Oct 2017
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781788397186
Edition: 1st Edition
Authors (4):
Shabbir Challawala

Shabbir Challawala has over 8 years of rich experience in providing solutions based on MySQL and PHP technologies. He is currently working with KNOWARTH Technologies. He has worked on various PHP-based e-commerce solutions and learning portals for enterprises, and with different PHP-based frameworks, such as Magento E-commerce, Drupal CMS, and Laravel. Shabbir has been involved in various enterprise solutions at different phases, such as architecture design, database optimization, and performance tuning. He has thorough exposure to the Software Development Life Cycle process. He has worked on integrating Big Data technologies such as MongoDB and Elasticsearch with a PHP-based framework.

Chintan Mehta

Chintan Mehta is a co-founder of KNOWARTH Technologies and heads the cloud/RIMS/DevOps team. He has rich, progressive experience in server administration of Linux, AWS Cloud, DevOps, RIMS, and open source technologies. He is also an AWS Certified Solutions Architect. Chintan has authored MySQL 8 for Big Data, Mastering Apache Solr 7.x, MySQL 8 Administrator's Guide, and Hadoop Backup and Recovery Solutions. He has also reviewed Liferay Portal Performance Best Practices and Building Serverless Web Applications.

Kandarp Patel

Kandarp Patel leads PHP practices at KNOWARTH Technologies. He has vast experience in providing end-to-end solutions in CMS, LMS, WCM, and e-commerce, along with various integrations for enterprise customers. He has over 9 years of rich experience in providing solutions in MySQL, MongoDB, and PHP-based frameworks. Kandarp is also a certified MongoDB and Magento developer. He has experience in various enterprise application development phases of the Software Development Life Cycle and has played a prominent role in requirement gathering, architecture design, database design, application development, performance tuning, and CI/CD. Kandarp has a Bachelor of Engineering in Information Technology from a reputed university in India.

Jaydip Lakhatariya

Jaydip Lakhatariya has rich experience in portal and J2EE frameworks. He adapts quickly to any new technology and has a keen desire for constant improvement. Currently, Jaydip is associated with a leading open source enterprise development company, KNOWARTH Technologies, where he is engaged in various enterprise projects. Jaydip, a full-stack developer, has proven his versatility by adopting technologies such as Liferay, Java, Spring, Struts, Hadoop, MySQL, Elasticsearch, Cassandra, MongoDB, Jenkins, SCM, PostgreSQL, and many more. He has been recognized with awards such as Merit, Commitment to Service, and also as a Star Performer. He loves mentoring people and has been delivering training for Portals and J2EE frameworks.

Evolution of MySQL for Big Data


Many enterprises have relied on MySQL as their relational database for over two decades. Over that time they have accumulated large volumes of data, used both for transactions and for analysis of what is collected and generated, and this is where Big Data analytics tools need to be implemented. Integrating MySQL with Hadoop makes this possible: Hadoop provides a distributed storage engine for the data, and a Hadoop cluster can serve as the distributed analytical engine for Big Data analytics. Hadoop is widely preferred for its massively parallel processing and computational power.

Combining MySQL and Hadoop also makes real-time analytics possible: Hadoop stores the data and works in parallel with MySQL to show the end results in real time. This helps address many use cases, such as GIS information, which is explained in the Introducing MySQL 8 section of this chapter. We saw the Big Data life cycle earlier, in which data is transformed to generate analytic results. Let's see how MySQL fits into that life cycle.

The following diagram illustrates how MySQL 8 is mapped to each of the four stages of the Big Data life cycle:

Acquiring data in MySQL

The volume and velocity of incoming data make it difficult to load data into MySQL with optimal performance. To address this, Oracle developed a NoSQL API that stores data directly in the InnoDB storage engine. Because this path skips SQL parsing and optimization, key/value data can be written straight to MySQL tables with fast transaction responses and without sacrificing ACID guarantees. MySQL Cluster also supports NoSQL APIs for Node.js, Java, JPA, HTTP/REST, and C++. We will explore this in detail later in the book; for now, keep in mind that the NoSQL API enables faster processing of data and transactions in MySQL.
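To picture what a key/value write looks like without SQL, here is a minimal sketch. It assumes the NoSQL interface to InnoDB is reached through the memcached text protocol (this section does not name the protocol; MySQL's InnoDB memcached plugin is one such interface). The helper names and the `user:42` key are hypothetical, and the code only frames requests and parses a reply; it does not contact a server.

```python
from typing import Optional


def _check_key(key: str) -> None:
    """Reject keys that the memcached text protocol cannot carry."""
    if not key or len(key) > 250:
        raise ValueError("key must be 1 to 250 characters long")
    if any(ch.isspace() or ord(ch) < 33 or ord(ch) > 126 for ch in key):
        raise ValueError("key must be printable ASCII without whitespace")


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a 'set' request: a header line, then the raw value bytes."""
    _check_key(key)
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a 'get' request for a single key."""
    _check_key(key)
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> Optional[bytes]:
    """Return the value from a single-key 'get' reply, or None on a miss."""
    if raw.startswith(b"END"):
        return None  # the server found no such key
    header, _, rest = raw.partition(b"\r\n")
    parts = header.split()
    # Expected header: VALUE <key> <flags> <bytes>
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError(f"unexpected response header: {header!r}")
    size = int(parts[3])
    return rest[:size]
```

Note that the request carries only a key and a value: there is no SQL statement for the server to parse or optimize, which is the point the paragraph above makes.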

Organizing data in Hadoop

Once the data has been acquired and loaded into MySQL, the next step is to organize it in the Hadoop filesystem. Big Data must be processed to produce analysis results, and Hadoop performs that processing in a highly parallel fashion. Hadoop is also a highly scalable distributed framework with substantial computational power. At this stage, data from different sources is consolidated for analysis. Apache Sqoop is used to transfer data between MySQL tables and HDFS.
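As a rough illustration of the MySQL-to-HDFS transfer, the sketch below assembles a `sqoop import` command line. The flags are standard Sqoop 1 import arguments, but the host, database, table, user, and paths are hypothetical placeholders, and the function only builds the argument list rather than running Sqoop.

```python
import shlex
from typing import List


def sqoop_import_args(
    jdbc_url: str,
    username: str,
    password_file: str,
    table: str,
    target_dir: str,
    num_mappers: int = 4,
) -> List[str]:
    """Build the argv for importing one MySQL table into an HDFS directory."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the MySQL database
        "--username", username,
        "--password-file", password_file,  # keeps the password off the command line
        "--table", table,                  # source table in MySQL
        "--target-dir", target_dir,        # destination directory in HDFS
        "--num-mappers", str(num_mappers), # number of parallel map tasks
    ]


# Hypothetical values, for illustration only.
example = sqoop_import_args(
    jdbc_url="jdbc:mysql://db.example.com/shop",
    username="etl_user",
    password_file="/user/etl/.mysql-password",
    table="orders",
    target_dir="/data/raw/orders",
    num_mappers=2,
)
print(shlex.join(example))
```

On a machine with Sqoop installed, the list could be handed to `subprocess.run(example, check=True)`; building it as a list rather than a single string avoids shell-quoting problems.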

Analyzing data

Now it's time to analyze the data. In this phase, the MySQL data is processed with Hadoop's MapReduce algorithm. Other analysis tools, such as Apache Hive or Apache Pig, can produce similar analytical results. We can also run custom analysis on Hadoop, which returns a result set containing the analyzed and processed data.
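The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process toy, not Hadoop code: it only shows the map, shuffle, and reduce steps that Hadoop runs in parallel across a cluster. The `country` and `amount` fields are hypothetical columns standing in for rows imported from a MySQL table.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(row: dict) -> Iterator[Tuple[str, int]]:
    """Emit one (key, value) pair per input row."""
    yield row["country"], row["amount"]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all values that share a key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(rows: Iterable[dict]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input rows."""
    pairs = (pair for row in rows for pair in map_phase(row))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the calls can be spread across many machines, which is where Hadoop's parallelism comes from.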

Results of analysis

The results produced in the previous phases are loaded back into MySQL, again with the help of Apache Sqoop. MySQL now holds analysis results that can be consumed by business intelligence tools such as Oracle BI Solution, Jaspersoft, and Talend, or by traditional web applications that generate analytical reports and, if required, perform real-time processing.
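Loading results back is the reverse transfer, which Sqoop exposes as `sqoop export`. The sketch below builds such a command under the assumption that the analysis wrote comma-separated files to an HDFS directory. The table name, paths, and connection details are hypothetical, and the target table is assumed to exist in MySQL already.

```python
from typing import List


def sqoop_export_args(
    jdbc_url: str,
    username: str,
    password_file: str,
    table: str,
    export_dir: str,
    field_separator: str = ",",
) -> List[str]:
    """Build the argv for exporting HDFS files into an existing MySQL table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,            # destination table in MySQL
        "--export-dir", export_dir,  # HDFS directory holding the result files
        "--input-fields-terminated-by", field_separator,
    ]


# Hypothetical values, for illustration only.
example = sqoop_export_args(
    jdbc_url="jdbc:mysql://db.example.com/shop",
    username="etl_user",
    password_file="/user/etl/.mysql-password",
    table="sales_by_country",
    export_dir="/data/results/sales_by_country",
)
```

Once the rows are in `sales_by_country`, reporting tools and web applications can query them like any other MySQL table.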

This is how MySQL fits easily into a Big Data solution. This architecture lets a structured database take part in Big Data analysis. To understand how to achieve this, refer to Chapter 9, Case study: Part I - Apache Sqoop for Exchanging Data between MySQL and Hadoop, and Chapter 10, Case study: Part II - Realtime event processing using MySQL applier, which cover a couple of real-world use cases in which we use MySQL 8 extensively to solve business problems and generate value from data.
