python - Pandas dataframe - deltas of data with same ids
I have a dataframe that looks like this:
      type  unique_id  val
    0    x          1   11
    1    x          2   12
    2    y          1   20
    3    y          2   30
The desired output is:
      type  unique_id  val  delta
    0    x          1   11      9
    1    x          2   12     18
    2    y          1   20      0
    3    y          2   30      0
Namely, I want to match every x with the y that has the same unique_id (the id is unique among the x rows and, separately, unique among the y rows). Then I want to calculate the difference in val between each x and its respective y row. For the y rows, the value can be 0.
Assuming unique_id is in fact unique within a given type, you can filter the data to type y and group on unique_id:
    gb = df[df.type == 'y'].groupby('unique_id').first()
    >>> gb
              type  val
    unique_id
    1            y   20
    2            y   30
You then join this to the original dataframe:
    df = (df.set_index('unique_id')
            .join(gb, rsuffix='_'))
    >>> df
              type  val type_  val_
    unique_id
    1            x   11     y    20
    1            y   20     y    20
    2            x   12     y    30
    2            y   30     y    30
You can now calculate the delta:
df['delta'] = df.val_ - df.val
Finally, reshape the data into the desired form:
    df = (df.reset_index()
            .sort_values('type')
            .drop(['val_', 'type_'], axis='columns'))
    # Reorder columns.
    >>> df[['type', 'unique_id', 'val', 'delta']]
      type  unique_id  val  delta
    0    x          1   11      9
    2    x          2   12     18
    1    y          1   20      0
    3    y          2   30      0
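For convenience, the steps above can be collected into one self-contained script. This is only a sketch under the same assumption that unique_id is unique within each type. The helper name add_delta is hypothetical and not part of pandas, and the sample frame is copied from the question. A stable sort (mergesort) is used so the x and y rows keep their relative order regardless of how the join orders its rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "type": ["x", "x", "y", "y"],
    "unique_id": [1, 2, 1, 2],
    "val": [11, 12, 20, 30],
})


def add_delta(frame):
    """Hypothetical helper: delta = val of the matching y row minus own val."""
    # One row per unique_id holding the y values (assumes ids are unique per type).
    y_rows = frame[frame["type"] == "y"].groupby("unique_id").first()
    # Attach the y values to every row sharing that unique_id.
    joined = frame.set_index("unique_id").join(y_rows, rsuffix="_")
    # y rows are joined with themselves, so their delta is 0.
    joined["delta"] = joined["val_"] - joined["val"]
    return (joined.reset_index()
                  .sort_values("type", kind="mergesort")  # stable sort
                  .drop(["val_", "type_"], axis="columns")
                  [["type", "unique_id", "val", "delta"]]
                  .reset_index(drop=True))


result = add_delta(df)
print(result)
```

Note that an x row with no matching y row would get a missing value (NaN) for delta, because the join is a left join.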