Processing, analyzing, and summarizing data using visualizations
We’re working in real estate now, and since we want to do well, we want to build an algorithm that helps us analyze data and predict housing prices. But let’s think about that for a second. We can define that problem very broadly or very narrowly: we can do a pricing analysis for all houses in a state, or only for houses with three bedrooms or more in a single neighborhood. Does the way we scope the analysis matter? Maybe. But finding that out is part of why we want to look at this problem.
Let’s start by gathering some data. For this problem, we’re using the kv_house_data.csv dataset, which is available in our GitHub repository. To look at this dataset, we’ll need quite a few libraries. We’ve mostly been talking about pandas, but we also want to create visualizations and perform some analysis, so we’ll need seaborn, numpy, and matplotlib as well. The full algorithm can be found in the ch16_housePrice_prediction...
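Before opening the full algorithm, here is a minimal sketch of what loading and taking a first look at such a dataset might involve. It is not the chapter's own code. The column names price and bedrooms, the helper function names, and the tiny in-memory sample are all assumptions made for illustration; the real file may use different columns. The plotting libraries are imported inside the plotting helper so that the loading and summary steps still run if seaborn or matplotlib is not installed.

```python
import io

import numpy as np
import pandas as pd


def load_housing_data(source):
    """Read a housing CSV (a file path or file-like object) into a DataFrame."""
    return pd.read_csv(source)


def summarize_prices(df, price_col="price"):
    """Return basic statistics for the price column (column name is assumed)."""
    prices = df[price_col].to_numpy(dtype=float)
    return {
        "count": int(prices.size),
        "mean": float(np.mean(prices)),
        "median": float(np.median(prices)),
        "min": float(np.min(prices)),
        "max": float(np.max(prices)),
    }


def filter_by_bedrooms(df, min_bedrooms=3, bedrooms_col="bedrooms"):
    """Narrow the problem: keep only houses with at least min_bedrooms."""
    return df[df[bedrooms_col] >= min_bedrooms]


def plot_price_distribution(df, price_col="price"):
    """Draw a histogram of prices with seaborn and return the figure."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots()
    sns.histplot(data=df, x=price_col, ax=ax)
    ax.set_title("Distribution of house prices")
    return fig


# Hypothetical three-row sample standing in for the real CSV file.
SAMPLE_CSV = """price,bedrooms
200000,2
300000,3
400000,4
"""

sample = load_housing_data(io.StringIO(SAMPLE_CSV))
summary = summarize_prices(sample)
larger_homes = filter_by_bedrooms(sample, min_bedrooms=3)
```

With the real data, we would pass the path to kv_house_data.csv to load_housing_data instead of the in-memory sample, and we could call plot_price_distribution on the result to see how prices are spread. Keeping the filter as a separate step makes it easy to switch between the broad and narrow versions of the problem.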