You're reading from Practical Machine Learning Cookbook
Subset selection: The use of labeled examples to induce a model that classifies objects into a finite set of known classes is one of the main challenges of supervised classification in machine learning. Vectors of numeric or nominal features are used to describe the various examples. In the feature subset selection problem, a learning algorithm is faced with the problem of selecting some subset of features upon which to focus its attention, while ignoring the rest.
When fitting a linear regression model, a subset of variables that best describe the data are of interest. There are a number of different ways the best subset, applying a number of different strategies, can be adopted when searching for a variables set. If there are m variables and the best regression model consists of p variables, p≤m, then a more general approach to pick the best subset might be to try all possible combinations of p variables and select the model that fits the data the best.
However, there are...
In order to compare the metabolic rate of humans, the concept of basal metabolic rate (BMR) is critical, in a clinical context, as a means of determining thyroid status in humans. The BMR of mammals varies with body mass, with the same allometric exponent as field metabolic rate, and with many physiological and biochemical rates. Fitbit, as a device, uses BMR and activities performed during the day to estimate calories burned throughout the day.
In order to perform shrinkage methods, we shall be using a dataset collected from Fitbit and a calories-burned dataset.
The dataset titled fitbit_export_20160806.csv
which is in CSV format shall be used. The dataset is in standard format. There are 30 rows of data and 10 variables. The numeric variables are as follows:
Calories Burned
Steps
Distance
Floors
Minutes Sedentary
Minutes Lightly Active
Minutes Fairly Active
ExAng
Minutes Very Active
Activity Calories
The...
Fleet planning is a part of the strategic planning process for any airline company. Fleet is the total number of aircraft that an airline operates, as well as the specific aircraft types that comprise the total fleet. Airline selection criteria for aircraft acquisition are based on technical/performance characteristics, economic and financial impact, environmental regulations and constraints, marketing considerations, and political realities. Fleet composition is a critical long-term strategic decision for an airline company. Each aircraft type has different technical performance characteristics, for example, the capacity to carry the payload over a maximum flight distance or range. It affects financial position, operating costs, and especially the ability to serve specific routes.
Food is a powerful symbol of who we are. There are many types of food identification, such as ethnic, religious, and class identifications. Ethnic food preferences become identity markers in the presence of gustatory foreigners, such as when one goes abroad, or when those foreigners visit the home shores.
In order to perform principal component analysis, we shall be using a dataset collected on the Epicurious recipe dataset.
Let's get into the details.
The first step is to load the following packages:
> install.packages("glmnet") > library(ggplot2) > library(glmnet)
Let's explore the data and understand the relationships among the variables. We'll begin by importing...