Classifier-building life cycle
At the top level, building a classifier usually proceeds as follows:
1. Create training data; refer to the following recipe for more about this.
2. Build training and evaluation infrastructure with a sanity check.
3. Establish baseline performance.
4. Select an optimization metric for the classifier. This is what the classifier is trying to do, and it will guide tuning.
5. Optimize the classifier via techniques such as:
   - Parameter tuning
   - Thresholding
   - Linguistic tuning
   - Adding training data
   - Refining the classifier definition
This recipe will present the first four steps in concrete terms, and there are recipes in this chapter for the optimization step.
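As a rough orientation before the concrete recipe, the first four steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the recipe's own implementation: the toy data, the MajorityBaseline and WordVoteClassifier classes, and the choice of F1 on the "neg" category as the optimization metric are all hypothetical and invented for illustration.

```python
from collections import Counter

# Step 1: training data. Hypothetical toy (text, label) pairs, invented here.
TRAIN = [
    ("great fun", "pos"),
    ("loved it", "pos"),
    ("really great", "pos"),
    ("awful mess", "neg"),
    ("hated it", "neg"),
]
TEST = [("great stuff", "pos"), ("awful stuff", "neg"), ("loved this", "pos")]


class MajorityBaseline:
    """Step 3: baseline that always predicts the most frequent training label."""

    def train(self, data):
        self.label = Counter(label for _, label in data).most_common(1)[0][0]

    def classify(self, text):
        return self.label


class WordVoteClassifier:
    """Illustrative classifier: each known word votes for the labels it was seen with."""

    def train(self, data):
        self.word_counts = {}
        self.fallback = Counter(label for _, label in data).most_common(1)[0][0]
        for text, label in data:
            for word in text.lower().split():
                self.word_counts.setdefault(word, Counter())[label] += 1

    def classify(self, text):
        votes = Counter()
        for word in text.lower().split():
            votes.update(self.word_counts.get(word, {}))
        # Fall back to the majority label when no word is known.
        return votes.most_common(1)[0][0] if votes else self.fallback


def accuracy(classifier, data):
    """Fraction of items whose predicted label matches the reference label."""
    correct = sum(classifier.classify(text) == label for text, label in data)
    return correct / len(data)


def f1(classifier, data, target):
    """Step 4: F1 for one target category, used here as the optimization metric."""
    tp = fp = fn = 0
    for text, label in data:
        guess = classifier.classify(text)
        if guess == target and label == target:
            tp += 1
        elif guess == target:
            fp += 1
        elif label == target:
            fn += 1
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def run():
    """Step 2: train, sanity-check on the training data, then evaluate on held-out data."""
    baseline, model = MajorityBaseline(), WordVoteClassifier()
    baseline.train(TRAIN)
    model.train(TRAIN)
    return {
        # Sanity check: a working pipeline should score well on data it has seen.
        "sanity_train_accuracy": accuracy(model, TRAIN),
        "baseline_accuracy": accuracy(baseline, TEST),
        "model_accuracy": accuracy(model, TEST),
        "baseline_f1_neg": f1(baseline, TEST, "neg"),
        "model_f1_neg": f1(model, TEST, "neg"),
    }


if __name__ == "__main__":
    for name, value in run().items():
        print(f"{name}: {value:.3f}")
```

Scoring the classifier on its own training data is only a check that the plumbing works; a low score there usually points to a bug in data loading or label handling rather than a modeling problem. The baseline numbers give later optimization steps something to beat, and the chosen metric decides which changes count as improvements.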
Getting ready
Nothing happens without training data for classifiers. Look at the Annotation recipe at the end of the chapter for tips on creating training data. You can also use an active learning framework (covered later in this chapter) to incrementally generate a training corpus; the data used in this recipe was generated that way.
Next, reduce the risk by starting with the...