
You're reading from  Learning Predictive Analytics with R

Product type: Book
Published in: Sep 2015
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781782169352
Edition: 1st Edition
Author (1)
Eric Mayor

Eric Mayor is a senior researcher and lecturer at the University of Neuchatel, Switzerland. He is an enthusiastic user of open source and proprietary predictive analytics software packages, such as R, Rapidminer, and Weka. He analyzes data on a daily basis and is keen to share his knowledge in a simple way.

C4.5


C4.5 works in a way similar to ID3, but uses the gain ratio as its partitioning criterion, which in part resolves the issue mentioned previously. Another advantage is that it can partition on numeric attributes, which it splits into categories. The value of the split is selected so as to decrease the entropy for the considered attribute. A further difference from ID3 is that C4.5 allows for post-pruning, that is, the bottom-up simplification of the tree to avoid overfitting the training data.
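The choice of a split value for a numeric attribute can be illustrated with a minimal sketch, written here in Python. It is not C4.5's actual implementation: the helper names `entropy` and `best_numeric_split`, the toy data, and the use of midpoints between consecutive distinct values as candidate thresholds are all assumptions made for illustration. The sketch keeps the threshold whose two resulting groups have the lowest weighted entropy.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_numeric_split(values, labels):
    """Return (threshold, weighted_entropy) for the binary split
    'value <= threshold' with the lowest weighted entropy.

    Candidate thresholds are midpoints between consecutive distinct
    values (an assumption of this sketch). Returns None when the
    attribute has fewer than two distinct values.
    """
    pairs = list(zip(values, labels))
    n = len(pairs)
    distinct = sorted(set(values))
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy data: a numeric attribute and a class attribute.
temperature = [64, 65, 68, 69, 70, 71]
play = ["yes", "yes", "yes", "no", "no", "no"]

threshold, h = best_numeric_split(temperature, play)
print(threshold, h)
```

With this toy data, the retained threshold separates the two classes perfectly, so both resulting categories ("at most the threshold" and "above the threshold") are pure.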

The gain ratio

Using the gain ratio as the partitioning criterion overcomes a shortcoming of ID3, which is to prefer attributes with many modalities as nodes because they have a higher information gain. The gain ratio divides the information gain by a value called split information. This value is computed as minus the sum of: the ratio of the number of cases in each modality of the attribute divided by the number of cases to partition upon, multiplied by the base 2 logarithm of the number...
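A small numeric sketch, in Python, may help show why dividing by the split information penalizes attributes with many modalities. It assumes the usual definition from the C4.5 literature, in which the logarithm is applied to the same ratio (cases in the modality over cases to partition). The function names and the toy data, including the `row_id` attribute, are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(attribute, labels):
    """Information gain of a categorical attribute divided by its
    split information (assumed standard definition)."""
    n = len(labels)
    # Group the class labels by modality of the attribute.
    groups = {}
    for modality, label in zip(attribute, labels):
        groups.setdefault(modality, []).append(label)
    # Entropy remaining after the partition, weighted by group size.
    remainder = sum((len(g) / n) * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = -sum((len(g) / n) * log2(len(g) / n) for g in groups.values())
    # An attribute with a single modality has zero split information.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: both attributes separate the classes perfectly,
# but row_id does so only because every case has its own modality.
labels = ["yes", "yes", "no", "no"]
row_id = [1, 2, 3, 4]
outlook = ["sunny", "sunny", "rain", "rain"]

print(gain_ratio(row_id, labels))   # information gain 1.0, split information 2.0
print(gain_ratio(outlook, labels))  # information gain 1.0, split information 1.0
```

Both attributes have the same information gain here, so ID3 would not distinguish between them, whereas the gain ratio favors `outlook` because its split information is lower.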
