python - Most efficient method to combine pandas DataFrames which have the same column value


For example, I have 2 dataframes that contain identical sample names but different feature data.

I want to count how many samples exist in both dataframes.

Sample data: df1 and df2

A naive way to solve the problem that I have thought about:

hit = 0
for i in range(len(df1)):
    for j in range(len(df2)):
        if df1.sample_name.iloc[i] == df2.sample_name.iloc[j]:
            hit += 1

I thought the loop procedure may waste a lot of time. Is there a simpler technique to tackle this?

Besides, how can I extract the subset of each dataframe with identical sample_name, and connect the feature data into a new dataframe?

I have tried pd.concat(df1, df2, keys = 'sample_name')

Here's a vectorized approach using NumPy broadcasting to get the hit value -

np.count_nonzero(df1.sample_name.values[:, None] == df2.sample_name.values)
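To make the broadcasting count concrete, and to touch on the second part of the question (extracting the rows with a shared sample_name and connecting their features), here is a minimal sketch. The two small dataframes and the column names feature_a and feature_b are made up for illustration, since the original data is not shown. The inner merge is one possible way to build the combined dataframe, not something taken from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the sample_name column comes from the question.
df1 = pd.DataFrame({"sample_name": ["s1", "s2", "s3", "s4"],
                    "feature_a": [0.1, 0.2, 0.3, 0.4]})
df2 = pd.DataFrame({"sample_name": ["s3", "s4", "s5"],
                    "feature_b": [10, 20, 30]})

# Broadcasting: a (len(df1), 1) column compared against a (len(df2),) row
# yields a len(df1) x len(df2) boolean matrix of pairwise matches.
names1 = df1["sample_name"].to_numpy()
names2 = df2["sample_name"].to_numpy()
hit = np.count_nonzero(names1[:, None] == names2)

# Alternative count that avoids building the full matrix: rows of df1 whose
# name appears anywhere in df2. This equals hit only when the names in df2
# are unique.
shared = int(df1["sample_name"].isin(df2["sample_name"]).sum())

# Inner join: keeps only rows whose sample_name occurs in both dataframes
# and places the feature columns of both side by side.
common = pd.merge(df1, df2, on="sample_name", how="inner")

print(hit, shared)
print(common)
```

Note that the broadcasting comparison allocates len(df1) x len(df2) booleans, so for very large dataframes the isin or merge route is likely to be lighter on memory.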
