To filter a pandas DataFrame by how often each category occurs, you might first try combining df.groupby with count. However, count returns one value per category, so the resulting Series is shorter than the original DataFrame and its index does not line up with the DataFrame's rows. Using it as a filter therefore raises an error.
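The mismatch is easy to reproduce. The following is a minimal sketch; the DataFrame and its column names (type, value) are made up for illustration and are not taken from the original notebook.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions.
df = pd.DataFrame({
    "type": ["A", "A", "O", "B", "O", "A"],
    "value": [5, 3, 2, 1, 4, 2],
})

# count() yields one row per category, indexed by the category label
# rather than by the DataFrame's row index.
counts = df.groupby("type")["type"].count()
print(len(counts), len(df))

# The boolean mask built from counts cannot be aligned with df's rows,
# so pandas refuses to use it as a filter.
try:
    df.loc[counts > 1]
    failed = False
except Exception as exc:
    failed = True
    print(type(exc).__name__, exc)
```

The mask has one entry per category while the DataFrame has one entry per row, which is why the two cannot be matched up.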
Instead of count, use transform. It returns a Series with the same length and index as the original DataFrame, so you can use it to filter without an error.
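As a rough sketch of the transform approach, the example below keeps only the rows whose category appears at least twice. The data, the column names, and the threshold of 2 are illustrative assumptions.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions.
df = pd.DataFrame({
    "type": ["A", "A", "O", "B", "O", "A"],
    "value": [5, 3, 2, 1, 4, 2],
})

# transform("count") computes the per-category count and broadcasts it
# back to every row, so the result shares df's length and index.
occurrences = df.groupby("type")["type"].transform("count")

# Keep only rows whose category occurs at least twice (assumed threshold).
filtered = df[occurrences >= 2]
print(filtered)
```

Because occurrences is aligned row for row with df, the comparison produces a boolean mask that pandas can apply directly.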
You can play with the code in this Colab notebook.