Masking DataFrame rows
The mask method performs the opposite operation to the where method. By default, it creates missing values wherever the boolean condition is True. In essence, it masks, or covers up, values in your dataset.
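As a minimal sketch of that relationship, the snippet below uses a small made-up DataFrame (the column names and values are illustrative assumptions, not part of the movie dataset). Calling mask with a condition should give the same result as calling where with the inverted condition.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [10, 20, 30, 40]})

# Boolean Series: True for the rows we want to cover up
cond = df['a'] > 2

# mask replaces values with NaN where the condition is True
masked = df.mask(cond)

# where keeps values where the condition is True, so invert it
kept = df.where(~cond)

# Both approaches should produce identical DataFrames
print(masked.equals(kept))
```

Choosing between the two is mostly a matter of readability: mask reads naturally when the condition describes what to hide, and where when it describes what to keep.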
Getting ready
In this recipe, we will mask all rows of the movie dataset for movies that were made in 2010 or later, and then filter out all the rows with missing values.
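These two steps can be sketched on a tiny stand-in for the movie data. The titles, years, and durations below are invented for illustration; only the title_year and movie_title names come from the recipe. Using dropna(how='all') for the filtering step is one reasonable choice and is an assumption here.

```python
import pandas as pd

# Invented stand-in for the movie dataset (all values are hypothetical)
toy = pd.DataFrame(
    {'title_year': [2009.0, 2012.0, None, 2005.0],
     'duration': [178.0, 164.0, 90.0, 120.0]},
    index=pd.Index(['Film A', 'Film B', 'Film C', 'Film D'],
                   name='movie_title'),
)

# True for rows from 2010 onward or with a missing year
criteria = (toy['title_year'] >= 2010) | toy['title_year'].isnull()

# Step 1: every value in the matching rows becomes NaN
masked = toy.mask(criteria)

# Step 2: drop the rows in which every value is missing
remaining = masked.dropna(how='all')

print(remaining)
```

Only the rows that were not masked (Film A and Film D in this toy data) survive the second step.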
How to do it...
- Read the movie dataset, set the movie title as the index, and create the criteria:
>>> import pandas as pd
>>> movie = pd.read_csv('data/movie.csv', index_col='movie_title')
>>> c1 = movie['title_year'] >= 2010
>>> c2 = movie['title_year'].isnull()
>>> criteria = c1 | c2
- Use the mask method on a DataFrame to make all the values in rows with movies that were made from 2010 onward missing. Any movie that originally had a missing value for title_year is also masked:
>>> movie.mask(criteria).head()

- Notice how all the values in the third, fourth, and fifth rows...