7.3 BDL and sources of uncertainty
In this case study, we will look at how we can model aleatoric and epistemic uncertainty in a regression problem when we are trying to predict a continuous outcome variable. We will use a real-life dataset of diamonds that contains the physical attributes of more than 50,000 diamonds as well as their prices. In particular, we will look at the relationship between the weight of a diamond (measured as its carat) and the price paid for the diamond.
Step 1: Setting up the environment
To set up the environment, we import several packages. We import tensorflow
and tensorflow_probability
for building and training vanilla and probabilistic neural networks, tensorflow_datasets
for importing the diamonds data set, numpy
for performing calculations and operations on numerical arrays (such as calculating the mean), pandas
for handling DataFrames, and matplotlib
for plotting:
import matplotlib.pyplot as plt
import numpy as np ...