Chapter 2: Pre-Model Workflow and Data Preprocessing
Without question, the quality of your data is the most important element in achieving successful outcomes from ML models. Unfortunately, many of the training courses and much of the content you find today gloss over this critical aspect because, for lack of a better way of putting it, data preparation can seem “dull,” “boring,” or even “tedious” compared to actually training models and deploying them into production environments. Regardless of your own perceptions, keep in mind that the idiom “garbage in, garbage out” applies fully to your ML pipeline. Furthermore, because data changes over time, data preprocessing isn’t a one-time endeavor but an ongoing task that you must account for.
This chapter will help you learn essential data preprocessing techniques such as handling missing data...