python - pandas - filter dataframe by another dataframe by row elements


I have a dataframe df1 which looks like:

   c  k  l
0  a  1  a
1  a  2  b
2  b  2  a
3  c  2  a
4  c  2  d

and another one called df2 like:

   c  l
0  a  b
1  c  a

I would like to filter df1, keeping only the rows whose values are not in df2. The values to filter out are the (a,b) and (c,a) tuples. So far I have tried to apply the isin method:

d = df[~(df['l'].isin(dfc['l']) & df['c'].isin(dfc['c']))] 

Apart from the fact that it seems too complicated to me, it returns:

   c  k  l
2  b  2  a
4  c  2  d

But I'm expecting:

   c  k  l
0  a  1  a
2  b  2  a
4  c  2  d

You can do this efficiently using isin on a MultiIndex constructed from the desired columns:

keys = ['c', 'l']
i1 = df1.set_index(keys).index
i2 = df2.set_index(keys).index
df1[~i1.isin(i2)]

I think this improves on @ians's similar solution because it doesn't assume any column type (i.e. it will work with numbers as well as strings).
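For reference, here is the same idea as a self-contained sketch that can be run as-is. The sample frames are rebuilt from the data in the question; the name `result` is only introduced for illustration.

```python
import pandas as pd

# Sample data taken from the question
df1 = pd.DataFrame({'c': ['a', 'a', 'b', 'c', 'c'],
                    'k': [1, 2, 2, 2, 2],
                    'l': ['a', 'b', 'a', 'a', 'd']})
df2 = pd.DataFrame({'c': ['a', 'c'],
                    'l': ['b', 'a']})

keys = ['c', 'l']

# Build a MultiIndex of (c, l) pairs for each frame
i1 = df1.set_index(keys).index
i2 = df2.set_index(keys).index

# Keep only the rows of df1 whose (c, l) pair does not appear in df2
result = df1[~i1.isin(i2)]  # 'result' is an illustrative name
print(result)
```

Because the comparison is done on whole (c, l) pairs rather than on each column separately, row 0 (a, a) and row 2 (b, a) are kept even though 'a' appears in both columns of df2.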


(The above answer is an edit. The following was my initial answer.)

Interesting! I haven't come across this before... I would probably solve it by merging the two arrays, then dropping the rows where df2 is defined. Here is an example, which makes use of a temporary array:

import pandas as pd

df1 = pd.DataFrame({'c': ['a', 'a', 'b', 'c', 'c'],
                    'k': [1, 2, 2, 2, 2],
                    'l': ['a', 'b', 'a', 'a', 'd']})
df2 = pd.DataFrame({'c': ['a', 'c'],
                    'l': ['b', 'a']})

# create a column marking df2 values
df2['marker'] = 1

# join the two, keeping all of df1's indices
joined = pd.merge(df1, df2, on=['c', 'l'], how='left')
joined

# extract the desired columns where marker is NaN
joined[pd.isnull(joined['marker'])][df1.columns]

There may be a way to do this without using the temporary array, but I can't think of one. As long as your data isn't huge, the above method should be a fast and sufficient answer.
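One possible way to avoid the hand-made marker column is the `indicator` option of `merge`, which adds a `_merge` column saying whether each row was found in the left frame only, the right frame only, or both. This is a minimal sketch under the same sample data, not part of the original answer; the names `merged` and `filtered` are only illustrative.

```python
import pandas as pd

# Sample data taken from the question
df1 = pd.DataFrame({'c': ['a', 'a', 'b', 'c', 'c'],
                    'k': [1, 2, 2, 2, 2],
                    'l': ['a', 'b', 'a', 'a', 'd']})
df2 = pd.DataFrame({'c': ['a', 'c'],
                    'l': ['b', 'a']})

# Left join; indicator=True adds a '_merge' column with the values
# 'left_only', 'right_only' or 'both'
merged = df1.merge(df2, on=['c', 'l'], how='left', indicator=True)

# Keep the rows that have no match in df2, restricted to df1's columns
filtered = merged.loc[merged['_merge'] == 'left_only', df1.columns]
print(filtered)
```

Note that `merge` returns a frame with a fresh integer index, so this only reproduces df1's original row labels when df1 uses the default index, as it does here.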

