Pandas: combine rows with different row names into a new df

Asked: 2016-01-05 10:37:08

Tags: python-3.x pandas dataframe

I have the following df1:

         A1       A2        A3        A4       B1        B2        B3      B4  \
3  0.202425  0.13495  0.202425  0.202425  0.94465  0.877175  0.877175  0.8097   

        C1      C2   ...           F3        F4        G1        G2        G3  \
3  1.21455  1.3495   ...     4.925676  4.318401  5.330526  5.600426  5.802851   

         G4        H1        H2       H3       H4  
3  5.398001  0.202425  0.067475  0.13495  0.13495  

[1 rows x 32 columns]

I want to create something like this:

       A1       A2        A3        A4
0.202425  0.13495   0.202425  0.202425
0.94465   0.877175  0.877175  0.8097
1.21455   1.3495    1.282025  1.282025
2.429101  2.496576  2.429101  2.429101    
3.441226  3.846076  3.643651  3.643651
4.723251  4.858201  4.925676  4.925676
5.330526  5.600426  5.802851  5.802851
0.202425  0.067475  0.13495   0.13495

Here is my code:

a_cols = [c for c in df1.columns if c.startswith('A')]
b_cols = [c for c in df1.columns if c.startswith('B')]
c_cols = [c for c in df1.columns if c.startswith('C')]
d_cols = [c for c in df1.columns if c.startswith('D')]
e_cols = [c for c in df1.columns if c.startswith('E')]
f_cols = [c for c in df1.columns if c.startswith('F')]
g_cols = [c for c in df1.columns if c.startswith('G')]
h_cols = [c for c in df1.columns if c.startswith('H')]

col_dict = dict(zip(a_cols, b_cols,c_cols,d_cols,e_cols,f_cols,g_cols,h_cols))
l=pd.concat([df1.loc[:, a_cols], df1.loc[:, b_cols], df1.loc[:, c_cols], df1.loc[:, d_cols], df1.loc[:, e_cols], df1.loc[:, f_cols], df1.loc[:, g_cols].df1.loc[:, h_cols].rename(columns=col_dict)])
print (l)

Somehow I can't zip multiple lists...

1 answer:

Answer 0 (score: 0)

Maybe you can take the numpy array from values and call reshape on it.

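This works as long as the number of columns is a multiple of 4, so that every group fills exactly one row of 4 values.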
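The df printed in this answer is a shortened, 20-column version of the frame from the question. A minimal sketch of how it could be rebuilt for testing, with the values copied from the printed output (the construction itself is an assumption, not part of the original post):

```python
import pandas as pd

# Column names and values copied from the printed output in this answer.
# Building the frame from a single list of values is an assumption.
columns = ['A1', 'A2', 'A3', 'A4', 'B1', 'B2', 'B3', 'B4',
           'F1', 'F2', 'F3', 'F4', 'G1', 'G2', 'G3', 'G4',
           'H1', 'H2', 'H3', 'H4']
values = [0.202425, 0.13495, 0.202425, 0.202425,
          0.94465, 0.877175, 0.877175, 0.8097,
          1.21455, 1.3495, 4.925676, 4.318401,
          5.330526, 5.600426, 5.802851, 5.398001,
          0.202425, 0.067475, 0.13495, 0.13495]

# One row, twenty columns, default integer index (0).
df = pd.DataFrame([values], columns=columns)
```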

print(df)
#         A1       A2        A3        A4       B1        B2        B3      B4  \
#0  0.202425  0.13495  0.202425  0.202425  0.94465  0.877175  0.877175  0.8097   
#
#        F1      F2        F3        F4        G1        G2        G3  \
#0  1.21455  1.3495  4.925676  4.318401  5.330526  5.600426  5.802851   
#
#         G4        H1        H2       H3       H4  
#0  5.398001  0.202425  0.067475  0.13495  0.13495  


print(df.values.reshape((5, 4)))

#[[ 0.202425  0.13495   0.202425  0.202425]
# [ 0.94465   0.877175  0.877175  0.8097  ]
# [ 1.21455   1.3495    4.925676  4.318401]
# [ 5.330526  5.600426  5.802851  5.398001]
# [ 0.202425  0.067475  0.13495   0.13495 ]]

print(df.values.shape[1])
#20

# Integer division (//) keeps the row count an int under Python 3.
print(pd.DataFrame(df.values.reshape((df.values.shape[1] // 4, 4)),
                   columns=['A1', 'A2', 'A3', 'A4']))

#         A1        A2        A3        A4
#0  0.202425  0.134950  0.202425  0.202425
#1  0.944650  0.877175  0.877175  0.809700
#2  1.214550  1.349500  4.925676  4.318401
#3  5.330526  5.600426  5.802851  5.398001
#4  0.202425  0.067475  0.134950  0.134950
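As for why the zip in the question fails: zip called with eight lists yields 8-tuples, while dict needs 2-element (key, value) pairs, so dict(zip(...)) raises a ValueError. A possible alternative that stays close to the question's approach is to rename each group separately and then concatenate. The sketch below uses a small hypothetical df1 with only two groups, and it assumes that every group has exactly 4 columns and a single-letter prefix:

```python
import pandas as pd

# Hypothetical stand-in for df1: one row, groups A and B with 4 columns each.
df1 = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7, 8]],
                   columns=['A1', 'A2', 'A3', 'A4', 'B1', 'B2', 'B3', 'B4'])

target = ['A1', 'A2', 'A3', 'A4']

# Assumption: the first character of each column name identifies its group.
prefixes = sorted({c[0] for c in df1.columns})

parts = []
for prefix in prefixes:
    cols = [c for c in df1.columns if c.startswith(prefix)]
    # zip pairs exactly two lists here: this group's names and the target names.
    parts.append(df1.loc[:, cols].rename(columns=dict(zip(cols, target))))

# Stack the renamed groups on top of each other and renumber the rows.
result = pd.concat(parts, ignore_index=True)
print(result)
```

Unlike reshape, this version does not depend on the column order in df1, only on the prefixes.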