This article, created by Jonathan R. Owens, Jon Lentz, and Brian Femiano, authors of Hadoop Real-World Solutions Cookbook, contains recipes that show how you can put Hadoop to use to answer different questions about your data. Several of the Hive examples demonstrate how to properly implement and use a custom user-defined function (UDF) for reuse in different analytics. Two Pig recipes show different analytics with the Audioscrobbler dataset, and one MapReduce Java API recipe shows Combiners.
In this article, we will cover:
Counting distinct IPs in weblog data using MapReduce and Combiners
Using Hive date UDFs to transform and sort event dates from geographic event data
Using Hive to build a per-month report of fatalities over geographic event data
Implementing a custom UDF in Hive to help validate source reliability over geographic event data
Marking the longest period of non-violence using Hive MAP/REDUCE operators and Python
Calculating the cosine similarity of artists in the Audioscrobbler dataset using Pig
Trimming outliers from the Audioscrobbler dataset using Pig and datafu
Learning to apply Apache Hive, Pig, and MapReduce to solve the specific problems you are faced with can be difficult. The recipes in this article present a few big data problems and provide solutions that show how to tackle them. You will notice that the questions we ask of the data are not incredibly complicated, but you will require a different approach when dealing with a large volume of data. Even though the sample datasets in the recipes are small, you will find that the code is still very applicable to bigger problem spaces distributed over large Hadoop clusters.
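To make the "different approach" concrete, here is a minimal sketch of the idea behind the first recipe: counting distinct IPs with a combiner that pre-aggregates each mapper's output before it reaches the reducer. It is a plain Python simulation, not the Hadoop Java API used in the book, and the function names, the sample log lines, and the assumption that the IP is the first whitespace-separated field are all hypothetical.

```python
from collections import Counter
from itertools import chain


def map_ips(log_lines):
    """Mapper: emit (ip, 1) per weblog line; assumes the IP is the first field."""
    for line in log_lines:
        fields = line.split()
        if fields:
            yield fields[0], 1


def combine(pairs):
    """Combiner: sum counts locally so each mapper ships one pair per IP."""
    local = Counter()
    for ip, count in pairs:
        local[ip] += count
    return list(local.items())


def reduce_counts(pairs):
    """Reducer: sum the partial counts per IP across all mappers."""
    totals = Counter()
    for ip, count in pairs:
        totals[ip] += count
    return dict(totals)


# Two hypothetical input splits, each processed by its own mapper.
split_a = [
    "10.0.0.1 GET /index.html",
    "10.0.0.1 GET /about.html",
    "10.0.0.2 GET /index.html",
]
split_b = [
    "10.0.0.2 GET /index.html",
    "10.0.0.3 GET /contact.html",
]

# Each split is mapped and combined independently, then everything is "shuffled".
combined = [combine(map_ips(split)) for split in (split_a, split_b)]
shuffled = list(chain.from_iterable(combined))
totals = reduce_counts(shuffled)

print(len(shuffled))  # pairs reaching the reducer (fewer than the 5 raw lines)
print(len(totals))    # number of distinct IPs
```

The combiner does not change the result. It only reduces how many pairs cross the network, which is why it matters more as the data volume grows.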
The analytic questions in this article are designed to highlight many of the more powerful features of the various tools. You will find many of these features and operators useful as you begin solving your own problems.
Read Big Data Analysis in full
This article by Anatoly Spektor, author of Instant Eclipse Application Testing How-to, explains what breakpoints are and how to use them in the Debug perspective: how to set them up and how to navigate through the code using the various breakpoint manipulation options. After reading it, you will be able to debug Java applications of any scope effectively. No prior knowledge of Eclipse is required, so the article suits developers with any level of experience in Eclipse application development and testing.
Read Using Debug Perspective – setting breakpoints in full
In this article by Vinod Krishnan, author of Oracle ADF 11gR2 Development Beginner's Guide, we will take a look at validating and using the model data. Validating data is important because the business depends on the data that gets stored in the database. So how do we validate the data? Validation makes sure that only valid data is stored in the database. It could be anything from comparing two fields in a table to multiple validations on a single field involving columns from a different table.
In any other framework, we would end up writing a lot of code even for a small validation. But in ADF, we do little or no coding at all, and most of the validations are achieved declaratively.
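As a rough illustration of what "declarative" means here, the sketch below describes validation rules as data and applies them with one generic function, instead of writing a separate check by hand for each case. This is not ADF's API or its Groovy syntax. It is a plain Python analogy, and the rule names, row fields, and messages are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A validation rule declared as data: a name, a message, and a predicate."""
    name: str
    message: str
    check: Callable[[dict], bool]


# Hypothetical rules for a hypothetical employee row.
RULES = [
    Rule(
        "salary_positive",
        "salary must be greater than zero",
        lambda row: row["salary"] > 0,
    ),
    # Compares two fields of the same row.
    Rule(
        "dates_ordered",
        "end_date must not precede start_date",
        lambda row: row["end_date"] >= row["start_date"],
    ),
]


def validate(row, rules=RULES):
    """Return the message of every rule the row violates; an empty list means valid."""
    return [rule.message for rule in rules if not rule.check(row)]


good = {"salary": 5000, "start_date": date(2013, 1, 1), "end_date": date(2013, 6, 30)}
bad = {"salary": 5000, "start_date": date(2013, 6, 30), "end_date": date(2013, 1, 1)}

print(validate(good))
print(validate(bad))
```

Adding a new validation then means adding one entry to the rule list, while the code that applies the rules stays unchanged.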
In this article, we will cover the following topics:
- Declarative validation
- Groovy expressions
Another alternative much discussed by the developer community is transforming the table into a graphic when it is displayed on small-screen devices. This is not an easy task, given the size and amount of data that a table can hold.
Let's see an alternative solution that combines the previous recipes with another plugin for rendering graphics. The main reason for this combination is that we use only one plugin per page, thus optimizing our load.
This article by Fernando Monteiro, author of the book Instant HTML5 Responsive Table Design How-to, explains what happens when we convert the data and use a properly formatted table to display a nice graphic for our users.
Read Converting tables into graphs (Advanced) in full
In this article by Cyrille Rossant, author of Learning IPython for Interactive Computing and Data Visualization, we will take a quick tour of IPython by introducing 10 essential features of this powerful tool. Although brief, this hands-on visit will cover a wide range of IPython functionalities.
Read Ten IPython essentials in full