python - Sorting within each subgroup and summing first three values -

- June 15, 2014

I have a pandas DataFrame with three columns: state_name, county_name, and population. population is numeric. The question I want to answer is: looking at the 3 most populous counties in each state, which are the 3 most populous states? I think I first need to group by state_name and county_name, and I can do that. After that I'm confused about how to proceed. I'm new to pandas, so any guidance would help.

Here's some dummy data (please include a sample of your data in future).

    state_name,county_name,population
    state1,state1_a,100
    state1,state1_b,8000
    state1,state1_c,75
    state1,state1_d,876
    state1,state1_e,2938
    state2,state2_a,200
    state2,state2_b,16000
    state2,state2_c,75
    state2,state2_d,876
    state2,state2_e,5876

Let's set the index to state_name and county_name, and select the 'population' column, which returns a MultiIndexed pandas.Series:

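    import pandas as pd

    # The sample is comma-separated, so pass sep=',' (the default separator is whitespace).
    df = pd.read_clipboard(sep=',')  # could have done index_col=[0, 1] here
    df = df.set_index(['state_name', 'county_name'])
    s = df.population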

Now we can do a Series.groupby and use nlargest on it (this wouldn't work on a DataFrame groupby, which is why we use the Series):

    s.groupby(level='state_name').nlargest(3)

    state_name  state_name  county_name
    state1      state1      state1_b        8000
                            state1_e        2938
                            state1_d         876
    state2      state2      state2_b       16000
                            state2_e        5876
                            state2_d         876
    Name: population, dtype: int64
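The answer stops at the per-state top three, but the question also asks which states rank highest on that basis. The following is a minimal sketch of one way to finish, assuming the dummy data above. The names csv_text, top3, state_totals and top_states are illustrative, and the data is read from a string rather than the clipboard so the snippet is self-contained.

```python
import io

import pandas as pd

# Same dummy data as above, inlined so the snippet runs without a clipboard.
csv_text = """state_name,county_name,population
state1,state1_a,100
state1,state1_b,8000
state1,state1_c,75
state1,state1_d,876
state1,state1_e,2938
state2,state2_a,200
state2,state2_b,16000
state2,state2_c,75
state2,state2_d,876
state2,state2_e,5876
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1])
s = df["population"]

# Three most populous counties within each state.
top3 = s.groupby(level="state_name").nlargest(3)

# Sum those three values per state. Grouping by position (level=0) avoids
# the ambiguity of the duplicated 'state_name' level in the result above.
state_totals = top3.groupby(level=0).sum()

# Three states with the largest top-three totals (only two exist in this sample).
top_states = state_totals.nlargest(3)
print(top_states)
```

With this sample, state2 totals 22752 (16000 + 5876 + 876) and state1 totals 11814 (8000 + 2938 + 876), so state2 ranks first.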
