python - Pandas dataframe - deltas of data with same ids
I have a dataframe that looks like this:

      type  unique_id  val
    0    x          1   11
    1    x          2   12
    2    y          1   20
    3    y          2   30

The desired output is:

      type  unique_id  val  delta
    0    x          1   11      9
    1    x          2   12     18
    2    y          1   20      0
    3    y          2   30      0

Namely, I want to match every x with the y that has the same unique_id (the id is unique among the xs and, separately, unique among the ys). Then I want to calculate the difference in val between each x and its respective y row. For the ys, the value can be 0.
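
To make the example reproducible, the input frame can be built as in the minimal sketch below. The variable name `df` is an assumption chosen to match the code in the answer, and integer dtypes are assumed for `unique_id` and `val`.

```python
import pandas as pd

# Example input: two x rows and two y rows, paired by unique_id.
df = pd.DataFrame({
    "type": ["x", "x", "y", "y"],
    "unique_id": [1, 2, 1, 2],
    "val": [11, 12, 20, 30],
})

print(df)
```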
Assuming unique_id is in fact unique for a given type, you can group on it after filtering the data down to type y.

    gb = df[df.type == 'y'].groupby('unique_id').first()

    >>> gb
              type  val
    unique_id
    1            y   20
    2            y   30

You can then join this to the original dataframe:

    df = (df.set_index('unique_id')
            .join(gb, rsuffix='_'))

    >>> df
              type  val type_  val_
    unique_id
    1            x   11     y    20
    1            y   20     y    20
    2            x   12     y    30
    2            y   30     y    30

Now you can calculate the delta:

    df['delta'] = df.val_ - df.val

Finally, reshape the data into the desired form:

    df = (df.reset_index()
            .sort_values('type')  # sort_values replaces the removed DataFrame.sort
            .drop(['val_', 'type_'], axis='columns'))

    # Reorder columns.
    >>> df[['type', 'unique_id', 'val', 'delta']]
      type  unique_id  val  delta
    0    x          1   11      9
    2    x          2   12     18
    1    y          1   20      0
    3    y          2   30      0
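
As an alternative to the join, the same deltas can probably be computed with a lookup via `Series.map`, which keeps the original row order and avoids the helper columns. This is a sketch of a different technique, not the approach used in the steps above, and it relies on the same assumption that unique_id is unique among the y rows. The name `y_val` is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "type": ["x", "x", "y", "y"],
    "unique_id": [1, 2, 1, 2],
    "val": [11, 12, 20, 30],
})

# Lookup table (hypothetical name): unique_id -> val of the matching y row.
# Assumes each unique_id appears at most once among the y rows.
y_val = df.loc[df["type"] == "y"].set_index("unique_id")["val"]

# For every row, subtract its own val from the val of the y row with the
# same unique_id. Each y row is matched with itself, so its delta is 0.
df["delta"] = df["unique_id"].map(y_val) - df["val"]

print(df)
```

If some x row had no matching y, `map` would yield NaN for that row, so the delta column would become floating point.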