Question

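I am trying to sort a pandas df into separate columns based on when the values in a column change. For the df below, I can split the df into separate columns whenever the value in Col B changes. But I am trying to add Col C, so that a new column starts whenever the values in Col B and Col C change.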

import pandas as pd

df = pd.DataFrame({
        'A' : [10,20,30,40,40,30,20,10,5,10,15,20,20,15,10,5],
        'B' : ['X','X','X','X','Y','Y','Y','Y','X','X','X','X','Y','Y','Y','Y'],
        'C' : ['W','W','Z','Z','Z','Z','W','W','W','W','Z','Z','Z','Z','W','W'],                                         
        })

d = df['B'].ne(df['B'].shift()).cumsum()
df['C'] =  d.groupby(df['B']).transform(lambda x: pd.factorize(x)[0]).add(1).astype(str)
df['D'] = df.groupby(['B','C']).cumcount()
df = df.set_index(['D','C','B'])['A'].unstack([2,1])
df.columns = df.columns.map(''.join)

Output:

   X1  Y1  X2  Y2
D                
0  10  40   5  20
1  20  30  10  15
2  30  20  15  10
3  40  10  20   5

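As you can see, a new column is created every time there is a new value in Col B. But I am also trying to incorporate Col C, so a new column should start every time the values in Col B and Col C change.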

Intended output:

   XW1  XZ1  YZ1  YW1  XW2  XZ2  YZ2  YW2
0   10   30   40   20    5   15   20   10
1   20   40   30   10   10   20   15    5

Answer 1

Just create the helper columns one by one, following your idea.

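df['key'] = df.B + df.C  # combined key built from B and C
df['key2'] = df.key.ne(df.key.shift()).cumsum()  # label each consecutive run of the same key
# renumber the runs of each key as 1, 2, ...; transform keeps the original index,
# whereas groupby(...).apply may return a MultiIndex on newer pandas versions
df['key2'] = df.groupby('key').key2.transform(lambda x: x.astype('category').cat.codes + 1)
df['key3'] = df.groupby(['key', 'key2']).cumcount()  # row position within each run, used as the pivot index
df['key'] = df.key + df.key2.astype(str)  # final column labels for the pivot

df.pivot(index='key3', columns='key', values='A')  # keyword arguments work across pandas versions; yields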
Out[126]: 
key   XW1  XW2  XZ1  XZ2  YW1  YW2  YZ1  YZ2
key3                                        
0      10    5   30   15   20   10   40   20
1      20   10   40   20   10    5   30   15
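Note that the pivot above sorts the columns alphabetically, while the intended output in the question lists them in order of first appearance (XW1, XZ1, YZ1, YW1, ...). The following is a minimal sketch of the same helper-column idea with an extra reordering step at the end. The names `key`, `run`, `occurrence`, `col`, `row` and `out` are introduced here for illustration only and are not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40, 40, 30, 20, 10, 5, 10, 15, 20, 20, 15, 10, 5],
    'B': ['X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y', 'X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y'],
    'C': ['W', 'W', 'Z', 'Z', 'Z', 'Z', 'W', 'W', 'W', 'W', 'Z', 'Z', 'Z', 'Z', 'W', 'W'],
})

# Combined key from B and C, e.g. 'XW', 'XZ', 'YZ', 'YW'.
key = df['B'] + df['C']

# Run id: increases by one each time the combined key changes.
run = key.ne(key.shift()).cumsum()

# For each key, number its runs 1, 2, ... in order of appearance.
occurrence = run.groupby(key).transform(lambda s: pd.factorize(s)[0] + 1)

# Column label (e.g. 'XW1') and row position within each run.
col = key + occurrence.astype(str)
row = df.groupby(run).cumcount()

out = df.assign(col=col, row=row).pivot(index='row', columns='col', values='A')

# pivot sorts the labels; restore the order in which they first appear.
out = out[col.drop_duplicates().tolist()]
print(out)
```

The reordering relies on `col.drop_duplicates()` keeping the first occurrence of each label, so the columns follow the order in which each B/C run first appears in the data.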
