python - Pandas dataframe - deltas of data with same ids


I have a dataframe that looks like this:

  type  unique_id  val
0    x          1   11
1    x          2   12
2    y          1   20
3    y          2   30

The desired output is:

  type  unique_id  val  delta
0    x          1   11      9
1    x          2   12     18
2    y          1   20      0
3    y          2   30      0

Namely, I want to match every x with the y that has the same unique_id (the id is unique among the xs and, separately, unique among the ys). Then I want to calculate the difference in val between each x and its respective y row. For the ys, the value can be 0.

Assuming unique_id is in fact unique for a given type, you can filter the data to type y and group it on unique_id.

gb = df[df.type == 'y'].groupby('unique_id').first()
>>> gb
          type  val
unique_id
1            y   20
2            y   30
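The approach depends on unique_id being unique within each type, so it may be worth checking that first. A minimal sketch, assuming the example data from the question (the name has_duplicates is only illustrative):

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "type": ["x", "x", "y", "y"],
    "unique_id": [1, 2, 1, 2],
    "val": [11, 12, 20, 30],
})

# True if any (type, unique_id) pair occurs more than once.
has_duplicates = df.duplicated(subset=["type", "unique_id"]).any()
if has_duplicates:
    raise ValueError("unique_id is not unique within each type")
```

If the check fails, first() would silently keep only the first y row per id, so the deltas for that id could be misleading.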

You then join this to the original dataframe:

df = (df.set_index('unique_id')
        .join(gb, rsuffix='_'))
>>> df
          type  val type_  val_
unique_id
1            x   11     y    20
1            y   20     y    20
2            x   12     y    30
2            y   30     y    30

Now you can calculate the delta:

df['delta'] = df.val_ - df.val 

Finally, reshape the data into the desired form:

# sort_values replaces DataFrame.sort, which was removed from pandas.
df = (df.reset_index()
        .sort_values('type')
        .drop(['val_', 'type_'], axis='columns'))
# Reorder columns.
>>> df[['type', 'unique_id', 'val', 'delta']]
  type  unique_id  val  delta
0    x          1   11      9
2    x          2   12     18
1    y          1   20      0
3    y          2   30      0
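For convenience, the steps above can be combined into a single runnable script. This is a sketch against a recent pandas version, assuming the example data from the question; the names y_vals and result are illustrative, and the final reset_index(drop=True) is an addition to get a clean 0..3 index.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "type": ["x", "x", "y", "y"],
    "unique_id": [1, 2, 1, 2],
    "val": [11, 12, 20, 30],
})

# One row per unique_id holding the y value to compare against.
y_vals = df[df["type"] == "y"].groupby("unique_id").first()

result = (
    df.set_index("unique_id")
      .join(y_vals, rsuffix="_")                       # adds type_ and val_
      .assign(delta=lambda d: d["val_"] - d["val"])    # y value minus own value
      .reset_index()
      .sort_values(["type", "unique_id"])
      .drop(columns=["val_", "type_"])
      .reset_index(drop=True)
)

# Reorder columns to match the desired output.
result = result[["type", "unique_id", "val", "delta"]]
print(result)
```

Because each y row is joined to itself, its delta comes out as 0 without any special handling.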
