How to remove DataFrame rows where a column's values are in a set?
I have a set, and I want to remove all rows in a DataFrame where a given column's value is in that set.
df = df[df.column_in_set not in remove_set]
This gives me the error:
'Series' objects are mutable, thus they cannot be hashed.
What is the most pandas/pythonic way to solve this problem? I could iterate through the rows and work out which iloc positions to exclude, but that seems a little inelegant.
Some sample input and expected output.
column_in_set  value_2  value_3
1              'a'      3
2              'b'      4
3              'c'      5
4              'd'      6

remove = set([2,4])

column_in_set  value_2  value_3
1              'a'      3
3              'c'      5
To make the selection you can write:
df[~df['column_in_set'].isin(remove)]
isin() simply checks if each value of the column/Series is in a set (or list or other iterable), returning a boolean Series.
In this case, we want to keep only the rows of the DataFrame whose column_in_set value is not in remove, so we invert the boolean values with ~ and then use the result to index the DataFrame.
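As a rough end-to-end sketch of this approach, here is the sample data from the question run through isin() and ~. The variable names mask and filtered are only illustrative, and the DataFrame is rebuilt by hand from the sample table, so treat it as an assumption about the real data rather than the asker's actual frame.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame(
    {
        "column_in_set": [1, 2, 3, 4],
        "value_2": ["a", "b", "c", "d"],
        "value_3": [3, 4, 5, 6],
    }
)
remove = {2, 4}

# isin() tests each element of the column for membership in `remove`
# and returns a boolean Series aligned with df's index.
mask = df["column_in_set"].isin(remove)

# ~ flips True/False, so indexing with it keeps the rows NOT in `remove`.
filtered = df[~mask]

print(filtered)
```

The original attempt failed because `not in` asks the set whether the whole Series is one of its members, which requires hashing the Series. isin() instead performs the membership test element by element.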