python - pandas - filter dataframe by another dataframe by row elements -
i have dataframe df1
looks like:
c k l 0 1 1 2 b 2 b 2 3 c 2 4 c 2 d
and called df2
like:
c l 0 b 1 c
i filter df1
keeping values not in df2
. values filter expected (a,b)
, (c,a)
tuples. far tried apply isin
method:
d = df[~(df['l'].isin(dfc['l']) & df['c'].isin(dfc['c']))]
apart seems me complicated, returns:
c k l 2 b 2 4 c 2 d
but i'm expecting:
c k l 0 1 2 b 2 4 c 2 d
you can efficiently using isin
on multiindex constructed desired columns:
keys = ['c', 'l'] i1 = df1.set_index(keys).index i2 = df2.set_index(keys).index df1[~i1.isin(i2)]
i think improves on @ians's similar solution because doesn't assume column type (i.e. work numbers strings).
(above answer edit. following initial answer)
interesting! haven't come across before... solve merging 2 arrays, dropping rows df2
defined. here example, makes use of temporary array:
df1 = pd.dataframe({'c': ['a', 'a', 'b', 'c', 'c'], 'k': [1, 2, 2, 2, 2], 'l': ['a', 'b', 'a', 'a', 'd']}) df2 = pd.dataframe({'c': ['a', 'c'], 'l': ['b', 'a']}) # create column marking df2 values df2['marker'] = 1 # join two, keeping of df1's indices joined = pd.merge(df1, df2, on=['c', 'l'], how='left') joined
# extract desired columns marker nan joined[pd.isnull(joined['marker'])][df1.columns]
there may way without using temporary array, can't think of one. long data isn't huge above method should fast , sufficient answer.
Comments
Post a Comment