Practical Big Data Analytics

Product type Book
Published in Jan 2018
Publisher Packt
ISBN-13 9781783554393
Pages 412 pages
Edition 1st Edition
Author Nataraj Dasgupta

Table of Contents (16) Chapters

Title Page
Packt Upsell
Contributors
Preface
Too Big or Not Too Big
Big Data Mining for the Masses
The Analytics Toolkit
Big Data With Hadoop
Big Data Mining with NoSQL
Spark for Big Data Analytics
An Introduction to Machine Learning Concepts
Machine Learning Deep Dive
Enterprise Data Science
Closing Thoughts on Big Data
External Data Science Resources
Other Books You May Enjoy

Chapter 2. Big Data Mining for the Masses

Implementing a big data mining platform that serves specific business requirements in an enterprise environment is non-trivial. While it is relatively simple to build a big data platform, the novelty of the tools presents a challenge for adoption by business-facing users accustomed to traditional methods of data mining. Adoption, ultimately, is the measure of how successful the platform becomes within an organization.

This chapter introduces some of the salient characteristics of big data analytics that are relevant to both practitioners and end users of analytics tools. It covers the following topics:

  • What is big data mining?
  • Big data mining in the enterprise:
    • Building a use case
    • Stakeholders of the solution
    • Implementation life cycle
  • Key technologies in big data mining:
    • Selecting the hardware stack:
      • Single/multinode architecture
      • Cloud-based environments
    • Selecting the software stack:
      • Hadoop, Spark, and NoSQL
      • Cloud-based environments

What is big data mining?


Big data mining forms the first of two broad categories of big data analytics, the other being predictive analytics, which we will cover in later chapters. In simple terms, big data mining refers to the entire life cycle of processing large-scale datasets, from procuring the data to implementing the tools used to analyze it.

The next few chapters will illustrate some of the high-level characteristics of any big data project that is undertaken in an organization.

Big data mining in the enterprise

Implementing a big data solution in a medium-to-large enterprise can be a challenging task because of the extremely dynamic and diverse range of considerations, not the least of which is determining what specific business objectives the solution will address.

Building the case for a Big Data strategy

Perhaps the most important aspect of big data mining is determining the appropriate use cases and needs that the platform would address. The success of any big data platform...

Technical elements of the big data platform


Our discussion, so far, has been focused on the high-level characteristics of design and deployment of big data solutions in an enterprise environment. We will now shift attention to the technical aspects of such undertakings. From time to time, we’ll incorporate high-level messages where appropriate in addition to the technical underpinnings of the topics in discussion.

At the technical level, there are two main considerations:

  • Selection of the hardware stack
  • Selection of the software and BI (business intelligence) platform

Over the past two to three years, it has become increasingly common for corporations to move their processes to cloud-based environments as a complement to in-house infrastructure. Because cloud-based deployments are now so common, an additional section on on-premises versus cloud-based deployments has been added. Note that the term On-premises can be used interchangeably with In-house, On-site, and...

Summary


In this chapter, we got a high-level overview of Big Data and some of the components of implementing a Big Data solution in the enterprise. Big Data requires selecting an optimal software and hardware stack, an effort that is non-trivial, not least because of the hundreds of solutions available in the industry. Although Big Data strategy may be deemed a subject best left to management rather than a technical audience, it is essential to understand its nuances.

Note that without a proper, well-defined strategy and corresponding high-level support, IT departments will remain limited in the extent to which they can provide successful solutions. Further, the solution, including the hardware-software stack, should be such that it can be adequately managed and supported by existing IT resources. Most companies will find it essential to recruit new hires for the Big Data implementation. Since such implementations require evaluation of various elements - business...
