Python pandas DataFrame: find the max for each unique value of another column
I have a large dataframe (from 500k to 1M rows) which contains, for example, these 3 numeric columns: id, a, b.
I want to filter the results in order to obtain a table like the one in the image below, where, for each unique value of column id, I have the maximum and minimum value of a and b. How can I do this?
Edit: I have updated the image below in order to be more clear: when I get the max or min of a column, I also need the data associated with it in the other columns.
Sample data (note that you posted an image, which can't be used by potential answerers without retyping, so I'm making a simple example in its place):

    import pandas as pd

    df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                       'a': range(8),
                       'b': range(8, 0, -1)})

The key to doing this is using idxmax and idxmin, and futzing with the indexes so that you can merge things in a readable way. Here's the whole answer; you may wish to examine the intermediate dataframes to see how it works.

    # Row labels of the per-id maximum and minimum of each column
    df_max = df.groupby('id').idxmax()
    df_max['type'] = 'max'
    df_min = df.groupby('id').idxmin()
    df_min['type'] = 'min'

    # DataFrame.append was removed in pandas 2.0, so concatenate instead
    df2 = (pd.concat([df_max, df_min])
             .set_index('type', append=True)
             .stack()
             .rename('index'))

    # Attach the full rows of df to the labels collected in df2
    df3 = pd.concat([df2.reset_index().drop('id', axis=1).set_index('index'),
                     df.loc[df2.values]], axis=1)

    df3.set_index(['id', 'level_2', 'type']).sort_index()

                     a  b
    id level_2 type
    1  a       max   3  5
               min   0  8
       b       max   0  8
               min   3  5
    2  a       max   7  1
               min   4  4
       b       max   4  4
               min   7  1

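As a starting point for examining the intermediate dataframes, here is a minimal sketch that rebuilds the sample data and inspects only the grouped idxmax step. The name rows_at_max_a is introduced here for illustration and is not part of the answer's code. The point to notice is that idxmax returns row labels of df, not the maxima themselves, so those labels can be passed to df.loc to recover the other columns.

```python
import pandas as pd

# Same sample data as in the answer
df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})

# For each id, the index label of the row where each column is largest
df_max = df.groupby('id').idxmax()
print(df_max)

# Use those labels to pull the full rows where 'a' is largest within each id
# (rows_at_max_a is an illustrative name, not from the original answer)
rows_at_max_a = df.loc[df_max['a']]
print(rows_at_max_a)
```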
Note in particular that df2 looks like this:

    id  type
    1   max   a    3
              b    0
    2   max   a    7
              b    4
    1   min   a    0
              b    3
    2   min   a    4
              b    7

The last column there holds the index values in df that were derived with idxmax and idxmin. So all the information you need is in df2. The rest is just a matter of merging it back with df and making it more readable.
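The same idea can also be written as an explicit loop, which some readers may find easier to follow than the stack-and-concat steps. This is only a sketch of an alternative, not the approach used above: the helper name extremes_with_rows and the 'column' index level are made up for illustration. It assumes the index of the dataframe is unique, and, because idxmax and idxmin return the first occurrence, it keeps only one row per group when there are ties.

```python
import pandas as pd

# Same sample data as in the answer
df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})


def extremes_with_rows(frame, key, value_cols):
    """Hypothetical helper: for each group and value column, return the
    full row holding the maximum and the full row holding the minimum."""
    grouped = frame.groupby(key)
    pieces = []
    for col in value_cols:
        for kind in ('max', 'min'):
            # Row label of the extreme value of `col` within each group
            if kind == 'max':
                labels = grouped[col].idxmax()
            else:
                labels = grouped[col].idxmin()
            rows = frame.loc[labels].copy()
            rows['column'] = col   # which column was maximised/minimised
            rows['type'] = kind    # 'max' or 'min'
            pieces.append(rows)
    result = pd.concat(pieces)
    return result.set_index([key, 'column', 'type']).sort_index()


result = extremes_with_rows(df, 'id', ['a', 'b'])
print(result)
```

On the sample data this produces one row per (id, column, type) combination, so result.loc[(1, 'a', 'max')] gives the values of a and b in the row where a is largest for id 1.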