python - Slicing and arranging dataframe in pandas


I want to arrange the data in a data frame into multiple dataframes or groups. The input data:

    id    channel  path
    15    direct   a1
    15    direct   a2
    15    direct   a3
    15    direct   a4
    213   paid     b2
    213   paid     b1
    2222  direct   as25
    2222  direct   dw46
    2222  direct   32q
    3111  paid     d32a
    3111  paid     23ff
    3111  paid     www32
    3111  paid     2d2
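To reproduce the problem, the input can be rebuilt as a DataFrame. This is only a sketch: the variable names `rows` and `df` are assumptions, and the values are copied from the table above.

```python
import pandas as pd

# Input data from the question, one row per (id, channel, path) record.
rows = [
    (15, "direct", "a1"), (15, "direct", "a2"),
    (15, "direct", "a3"), (15, "direct", "a4"),
    (213, "paid", "b2"), (213, "paid", "b1"),
    (2222, "direct", "as25"), (2222, "direct", "dw46"), (2222, "direct", "32q"),
    (3111, "paid", "d32a"), (3111, "paid", "23ff"),
    (3111, "paid", "www32"), (3111, "paid", "2d2"),
]
df = pd.DataFrame(rows, columns=["id", "channel", "path"])

# Number of paths recorded for each id; this is what the grouping is based on.
paths_per_id = df.groupby("id").size()
print(paths_per_id)
```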

The desired output should look like:

    id    channel  p1  p2
    213   paid     b2  b1

    id    channel  p1    p2    p3
    2222  direct   as25  dw46  32q

    id    channel  p1    p2    p3     p4
    15    direct   a1    a2    a3     a4
    3111  paid     d32a  23ff  www32  2d2

Please tell me a way I can achieve this. Thanks.

I think you can first create a helper column cols with cumcount, then reshape with pivot_table. Next you need to find the number of notnull columns in each row (subtract the first 2, id and channel) and groupby that length. Last, dropna the empty columns in each group:

    # number the paths within each id: p1, p2, ...
    df['cols'] = 'p' + (df.groupby('id')['id'].cumcount() + 1).astype(str)

    # one row per (id, channel), one column per path position
    df1 = df.pivot_table(index=['id', 'channel'],
                         columns='cols',
                         values='path',
                         aggfunc='first').reset_index().rename_axis(None, axis=1)

    print(df1)
         id channel    p1    p2     p3    p4
    0    15  direct    a1    a2     a3    a4
    1   213    paid    b2    b1   None  None
    2  2222  direct  as25  dw46    32q  None
    3  3111    paid  d32a  23ff  www32   2d2

    # count the non-null path columns per row (minus id and channel)
    print(df1.apply(lambda x: x.notnull().sum() - 2, axis=1))
    0    4
    1    2
    2    3
    3    4
    dtype: int64

    for i, g in df1.groupby(df1.apply(lambda x: x.notnull().sum() - 2, axis=1)):
        print(i)
        print(g.dropna(axis=1))
    2
        id channel  p1  p2
    1  213    paid  b2  b1
    3
         id channel    p1    p2   p3
    2  2222  direct  as25  dw46  32q
    4
         id channel    p1    p2     p3   p4
    0    15  direct    a1    a2     a3   a4
    3  3111    paid  d32a  23ff  www32  2d2
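The row-wise `apply` used for counting can probably be replaced by a vectorized call. The sketch below rebuilds the question's data so it runs on its own; `id_cols` and `n_paths` are names assumed here, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [15] * 4 + [213] * 2 + [2222] * 3 + [3111] * 4,
    "channel": ["direct"] * 4 + ["paid"] * 2 + ["direct"] * 3 + ["paid"] * 4,
    "path": ["a1", "a2", "a3", "a4", "b2", "b1", "as25", "dw46", "32q",
             "d32a", "23ff", "www32", "2d2"],
})

# Same reshaping as above: number the paths per id, then pivot to wide form.
df["cols"] = "p" + (df.groupby("id").cumcount() + 1).astype(str)
df1 = (df.pivot_table(index=["id", "channel"], columns="cols",
                      values="path", aggfunc="first")
         .reset_index()
         .rename_axis(None, axis=1))

# Drop the identifier columns by name instead of subtracting 2,
# then count the non-null path cells in each row in one vectorized step.
id_cols = ["id", "channel"]
n_paths = df1.drop(columns=id_cols).notna().sum(axis=1)
print(n_paths.tolist())  # [4, 2, 3, 4]
```

Dropping the identifier columns by name avoids the hard-coded `- 2`, so the count stays correct if more identifier columns are added to `id_cols` later.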

For storing the results, you can use a dictionary of dataframes, keyed by the number of paths:

    dfs = {i: g.dropna(axis=1)
           for i, g in df1.groupby(df1.apply(lambda x: x.notnull().sum() - 2, axis=1))}

    # select the dataframe with 2 path columns
    print(dfs[2])
        id channel  p1  p2
    1  213    paid  b2  b1

    # select the dataframe with 3 path columns
    print(dfs[3])
         id channel    p1    p2   p3
    2  2222  direct  as25  dw46  32q
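A possible variation, swapped in here and not part of the original answer, is to split the long data by path count first and pivot each piece separately, so no empty columns are created and no dropna is needed. The helper name `split_by_path_count` is hypothetical, and the sketch assumes a pandas version where `DataFrame.pivot` accepts a list for `index` (1.1 or later).

```python
import pandas as pd


def split_by_path_count(df):
    """Return {number_of_paths: wide DataFrame} for the ids with that many paths."""
    # Number of paths per id, broadcast back onto every row of the long frame.
    n_paths = df.groupby("id")["path"].transform("size")
    out = {}
    for n, part in df.groupby(n_paths):
        # Position of each path within its id: 1, 2, 3, ...
        step = part.groupby("id").cumcount() + 1
        wide = (part.assign(cols="p" + step.astype(str))
                    .pivot(index=["id", "channel"], columns="cols", values="path")
                    .reset_index()
                    .rename_axis(None, axis=1))
        out[int(n)] = wide
    return out


df = pd.DataFrame({
    "id": [15] * 4 + [213] * 2 + [2222] * 3 + [3111] * 4,
    "channel": ["direct"] * 4 + ["paid"] * 2 + ["direct"] * 3 + ["paid"] * 4,
    "path": ["a1", "a2", "a3", "a4", "b2", "b1", "as25", "dw46", "32q",
             "d32a", "23ff", "www32", "2d2"],
})

dfs = split_by_path_count(df)
print(sorted(dfs))  # [2, 3, 4]
print(dfs[2])
```

Each piece gets a fresh 0-based row index here, unlike the dictionary above, which keeps the row labels of df1. Column labels are sorted as strings, so with ten or more paths per id, p10 would sort before p2 and the columns would need reordering.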
