Python pandas dataframe: find the max for each unique value of another column


I have a large dataframe (from 500k to 1M rows) which contains, for example, these 3 numeric columns: id, a, b.

I want to filter the results in order to obtain a table like Table 1 in the image below, where, for each unique value of column id, I have the maximum and minimum value of a and b. How can I do that?

Edit: I have updated the image below in order to be more clear: when I get the max or min of a column, I also need the data associated with it from the other columns.


Sample data (note that a posted image can't be used by potential answerers without retyping, so I'm making a simple example in its place):

import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})

The key is using idxmax and idxmin, and then futzing with the indexes so that you can merge things in a readable way. Here's the whole answer, and you may wish to examine the intermediate dataframes to see how this is working.

# Row labels of each column's max and min within every id
df_max = df.groupby('id').idxmax()
df_max['type'] = 'max'
df_min = df.groupby('id').idxmin()
df_min['type'] = 'min'

# Stack both results into one Series of row labels
df2 = pd.concat([df_max, df_min]).set_index('type', append=True).stack().rename('index')

# Put the full rows of df next to each (type, column) label
df3 = pd.concat([df2.reset_index().drop('id', axis=1).set_index('index'),
                 df.loc[df2.values]], axis=1)

df3.set_index(['id', 'level_2', 'type']).sort_index()

                 a  b
id level_2 type
1  a       max   3  5
           min   0  8
   b       max   0  8
           min   3  5
2  a       max   7  1
           min   4  4
   b       max   4  4
           min   7  1
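If the reshaping obscures the core idea, here is a minimal sketch restricted to the single column a, assuming the same sample df as above. The names idx_max_a, idx_min_a, rows_max_a and rows_min_a are only illustrative and are not part of the answer's code.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})

# Row labels (index values of df) where 'a' is largest / smallest per id.
idx_max_a = df.groupby('id')['a'].idxmax()
idx_min_a = df.groupby('id')['a'].idxmin()

# .loc with those labels returns the complete rows, so the values of the
# other columns come along with each extreme.
rows_max_a = df.loc[idx_max_a]
rows_min_a = df.loc[idx_min_a]

print(rows_max_a)
print(rows_min_a)
```

Because idxmax and idxmin return row labels rather than values, the associated b value for each extreme of a is recovered with a plain .loc lookup.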

Note in particular that df2 looks like this:

id  type
1   max   a    3
          b    0
2   max   a    7
          b    4
1   min   a    0
          b    3
2   min   a    4
          b    7

The last column there holds the index values in df that were derived with idxmax & idxmin. So basically all the information you need is in df2. The rest of it is just a matter of merging back with df and making it more readable.
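As a possible alternative to the stack-and-merge steps, the same table could be assembled with an explicit loop over the columns. This is only a sketch under the assumption that the sample df above is representative; the helper name extremes and the label column name 'column' are hypothetical, not anything from pandas or from the code above.

```python
import pandas as pd


def extremes(frame, key, columns):
    """Hypothetical helper: one full row per (key, column, max/min)."""
    grouped = frame.groupby(key)
    pieces = []
    for col in columns:
        lookups = (('max', grouped[col].idxmax()),
                   ('min', grouped[col].idxmin()))
        for kind, labels in lookups:
            # Pull the complete rows at the extreme positions and tag them.
            rows = frame.loc[labels].copy()
            rows['column'] = col
            rows['type'] = kind
            pieces.append(rows)
    return (pd.concat(pieces)
              .set_index([key, 'column', 'type'])
              .sort_index())


df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})

result = extremes(df, 'id', ['a', 'b'])
print(result)
```

The loop trades the compactness of stack for steps that are easier to inspect one at a time. On data with ties, idxmax and idxmin return the first matching row label, so only one row per extreme is kept in either version.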

