You're reading from Automated Machine Learning with AutoKeras

Product type: Book
Published in: May 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800567641
Edition: 1st Edition
Author (1)
Luis Sobrecueva

Luis Sobrecueva is a senior software engineer and ML/DL practitioner currently working at Cabify. He has contributed to the OpenAI project and is one of the contributors to the AutoKeras project.

Splitting your dataset for training and evaluation

To evaluate a model, you must divide your dataset into three subsets: a training set, a validation set, and a test set. During the training phase, AutoKeras trains your model on the training set and uses the validation set to evaluate its performance. Once the model is ready, the final evaluation is done on the test set.
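The three-way split described above can be sketched with plain NumPy. This is a minimal illustration rather than the book's own code: the helper name split_dataset, the 60/20/20 proportions, the seed, and the toy data are all assumptions chosen for the example.

```python
import numpy as np


def split_dataset(x, y, val_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle x and y together and cut them into train/validation/test subsets.

    Hypothetical helper for illustration; the fractions are assumptions.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")

    # Shuffle the indices once so that features and labels stay aligned.
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))

    n_test = int(len(x) * test_fraction)
    n_val = int(len(x) * val_fraction)

    # The three index ranges do not overlap, so no sample is shared.
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    return (
        (x[train_idx], y[train_idx]),
        (x[val_idx], y[val_idx]),
        (x[test_idx], y[test_idx]),
    )


# Toy data: 100 samples with 4 features each (purely illustrative).
x = np.arange(400).reshape(100, 4)
y = np.arange(100) % 2

(x_train, y_train), (x_val, y_val), (x_test, y_test) = split_dataset(x, y)
print(len(x_train), len(x_val), len(x_test))  # 60 20 20
```

With AutoKeras, the training and validation subsets would typically be handed to the model's fit call and the test subset kept aside for the final evaluate call; the exact arguments should be checked against the AutoKeras documentation.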

Why you should split your dataset

A separate test dataset that is never used during training is essential to avoid information leaks.

As we mentioned previously, the validation set is used to tune the hyperparameters of your model based on its performance, so some information about the validation data leaks into the model. As a result, you run the risk of ending up with a model that works artificially well on the validation data, because that is what it was tuned for. However, the actual performance of the model is due to us using previously...
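The leak described above can be made concrete with a toy simulation. This is a sketch under loud assumptions, not AutoKeras code: the labels are random coin flips, each "model" is a random guesser standing in for one hyperparameter configuration, and the candidate count and set sizes are arbitrary. Picking the best candidate by validation accuracy yields a validation score well above chance, while the same candidate scores close to 50% on a test set that played no part in the selection.

```python
import numpy as np

rng = np.random.default_rng(12345)

# Labels are pure coin flips, so no model can truly beat 50% accuracy.
y_val = rng.integers(0, 2, size=50)
y_test = rng.integers(0, 2, size=10_000)

n_candidates = 1_000  # hypothetical number of configurations tried
best_val_acc = 0.0
best_seed = None

for candidate_seed in range(n_candidates):
    # Each "model" is just a random guesser identified by its seed.
    model_rng = np.random.default_rng(candidate_seed)
    val_predictions = model_rng.integers(0, 2, size=len(y_val))
    val_acc = np.mean(val_predictions == y_val)
    if val_acc > best_val_acc:
        best_val_acc, best_seed = val_acc, candidate_seed

# Evaluate the selected "model" on data that played no part in the selection.
model_rng = np.random.default_rng(best_seed)
model_rng.integers(0, 2, size=len(y_val))  # skip the draws used for validation
test_predictions = model_rng.integers(0, 2, size=len(y_test))
test_acc = np.mean(test_predictions == y_test)

print(f"best validation accuracy: {best_val_acc:.2f}")
print(f"test accuracy of that model: {test_acc:.2f}")
```

The gap between the two printed numbers comes entirely from selecting on the validation set, which is why a held-out test set gives the more trustworthy estimate.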
