How do I transpose (stack) arbitrary columns in a DataFrame?

Asked: 2014-09-24 10:00:29

Tags: python numpy pandas

I'll use this DataFrame as an example:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(3, 6), 
                  columns=['a', 'b', 'c', '2010', '2011', '2012'])

which produces this data:

          a         b         c      2010      2011      2012
0 -2.161845 -0.995818 -0.225338  0.107255 -1.114179  0.701679
1  1.083428 -1.473900  0.890769 -0.937312  0.781201 -0.043237
2 -1.187588  0.241896  0.465302 -0.194004  0.921763 -1.359859

Now I want to transpose (stack) the columns '2010', '2011' and '2012' into rows, so that I get:

        a         b         c 
-2.161845 -0.995818 -0.225338 2010  0.107255
 1.083428 -1.473900  0.890769 2010 -0.937312
-1.187588  0.241896  0.465302 2010 -0.194004
-2.161845 -0.995818 -0.225338 2011 -1.114179
 1.083428 -1.473900  0.890769 2011  0.781201
-1.187588  0.241896  0.465302 2011  0.921763
-2.161845 -0.995818 -0.225338 2012  0.701679
 1.083428 -1.473900  0.890769 2012 -0.043237
-1.187588  0.241896  0.465302 2012 -1.359859

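With df.stack(), pandas "stacks" all of the columns into rows, whereas I only want to stack the columns I specify. So my question is: how do I turn arbitrary columns into rows in a pandas DataFrame?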

1 Answer:

Answer 0: (score: 3)

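You should use pandas.melt, which keeps the columns listed in id_vars as they are and stacks the remaining columns into rows: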

import numpy as np
import pandas as pd

# Note I've changed it from random numbers to integers as I 
# find it easier to read and see the differences :)
df = pd.DataFrame(np.arange(18).reshape((3,6)), 
                  columns=['a', 'b', 'c', '2010', '2011', '2012'])

var = ['a', 'b', 'c']
melted = pd.melt(df, id_vars=var)

print(melted)
#     a   b   c variable  value
# 0   0   1   2     2010      3
# 1   6   7   8     2010      9
# 2  12  13  14     2010     15
# 3   0   1   2     2011      4
# 4   6   7   8     2011     10
# 5  12  13  14     2011     16
# 6   0   1   2     2012      5
# 7   6   7   8     2012     11
# 8  12  13  14     2012     17
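
If you only want to stack some of the remaining columns, or want friendlier names than "variable" and "value", pd.melt also accepts value_vars, var_name and value_name. Here is a minimal sketch using the same integer DataFrame as above; the names 'year' and 'amount' are just illustrative choices of mine, not something from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(18).reshape((3, 6)),
                  columns=['a', 'b', 'c', '2010', '2011', '2012'])

# Stack only '2010' and '2012'. The '2011' column is left out of the
# result because it appears in neither id_vars nor value_vars.
melted = pd.melt(df,
                 id_vars=['a', 'b', 'c'],
                 value_vars=['2010', '2012'],
                 var_name='year',       # illustrative name for the label column
                 value_name='amount')   # illustrative name for the value column

print(melted)
```

The result has six rows (three per stacked column) and the columns a, b, c, year and amount, so this is one way to pick exactly which columns get turned into rows.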