python - pandas - filter dataframe by another dataframe by row elements -
i have dataframe df1 looks like:
   c  k  l 0   1  1   2  b 2  b  2  3  c  2  4  c  2  d   and called df2 like:
   c  l 0   b 1  c    i filter df1 keeping values not in df2. values filter expected (a,b) , (c,a) tuples. far tried apply isin method:
d = df[~(df['l'].isin(dfc['l']) & df['c'].isin(dfc['c']))]   apart seems me complicated, returns:
   c  k  l 2  b  2  4  c  2  d   but i'm expecting:
   c  k  l 0   1  2  b  2  4  c  2  d      
you can efficiently using isin on multiindex constructed desired columns:
keys = ['c', 'l'] i1 = df1.set_index(keys).index i2 = df2.set_index(keys).index df1[~i1.isin(i2)]     i think improves on @ians's similar solution because doesn't assume column type (i.e. work numbers strings).
(above answer edit. following initial answer)
interesting! haven't come across before... solve merging 2 arrays, dropping rows df2 defined. here example, makes use of temporary array:
df1 = pd.dataframe({'c': ['a', 'a', 'b', 'c', 'c'],                     'k': [1, 2, 2, 2, 2],                     'l': ['a', 'b', 'a', 'a', 'd']}) df2 = pd.dataframe({'c': ['a', 'c'],                     'l': ['b', 'a']})  # create column marking df2 values df2['marker'] = 1  # join two, keeping of df1's indices joined = pd.merge(df1, df2, on=['c', 'l'], how='left') joined     # extract desired columns marker nan joined[pd.isnull(joined['marker'])][df1.columns]     there may way without using temporary array, can't think of one. long data isn't huge above method should fast , sufficient answer.


Comments
Post a Comment