Reshaping a DataFrame in a specific way with pandas / numpy: converting multiple columns into two columns

Time: 2020-03-07 02:02:53

Tags: python pandas algorithm numpy

Given the following data:

import pandas as pd

df = pd.DataFrame(
    dict(
        x1=["zero", "one", "two"],
        x2=["three", "four", "five"],
        x3=["six", "seven", "eight"],
        x4=["nine", "ten", "eleven"],
    )
)

which looks like this:

In [2]: df
Out[2]:
     x1     x2     x3      x4
0  zero  three    six    nine
1   one   four  seven     ten
2   two   five  eight  eleven

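I would like to reshape it into the following, where each row pairs a value with the value in the next column of the same row: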

x1      x2
zero    three
one     four
two     five
three   six
four    seven
five    eight
six     nine
seven   ten
eight   eleven

The following works, but I don't think it is the right approach:

c1 = df.columns[: df.shape[1] - 1]
c2 = df.columns[1:]
d1 = df.loc[:, c1].T.values.flatten()
d2 = df.loc[:, c2].T.values.flatten()
pd.DataFrame(dict(x1=d1, x2=d2))

1 Answer:

Answer 0 (score: 1)

Try np.vstack with iloc slicing inside a list comprehension:

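import numpy as np

# Each slice df.iloc[:, i:i+2] is one pair of adjacent columns; vstack stacks the pairs vertically
df_new = (pd.DataFrame(np.vstack([df.iloc[:,i:i+2].to_numpy()
                                   for i in range(df.shape[1]-1)]),
                      columns=['x1', 'x2']))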

[out]

      x1      x2
0   zero   three
1    one    four
2    two    five
3  three     six
4   four   seven
5   five   eight
6    six    nine
7  seven     ten
8  eight  eleven
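
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea wrapped in a function, assuming the frame has at least two columns. The helper names `pair_adjacent_columns` and `pair_adjacent_columns_ravel` and the `names` parameter are made up for illustration. The second helper is not the method above but an alternative that flattens the underlying array in column-major order, which is essentially what the `.T.values.flatten()` code in the question does.

```python
import numpy as np
import pandas as pd


def pair_adjacent_columns(df, names=("x1", "x2")):
    """Stack every pair of neighbouring columns into a two-column frame.

    Hypothetical helper; assumes df has at least two columns
    (np.vstack raises on an empty list otherwise).
    """
    # One (n_rows, 2) block per adjacent column pair: (0, 1), (1, 2), ...
    blocks = [df.iloc[:, i:i + 2].to_numpy() for i in range(df.shape[1] - 1)]
    return pd.DataFrame(np.vstack(blocks), columns=list(names))


def pair_adjacent_columns_ravel(df, names=("x1", "x2")):
    """Alternative sketch: flatten the array column by column (order="F")."""
    arr = df.to_numpy()
    left = arr[:, :-1].ravel(order="F")   # every column except the last
    right = arr[:, 1:].ravel(order="F")   # every column except the first
    return pd.DataFrame({names[0]: left, names[1]: right})


df = pd.DataFrame(
    dict(
        x1=["zero", "one", "two"],
        x2=["three", "four", "five"],
        x3=["six", "seven", "eight"],
        x4=["nine", "ten", "eleven"],
    )
)

result = pair_adjacent_columns(df)
alt = pair_adjacent_columns_ravel(df)
print(result)
```

Both helpers should give the same nine rows for the sample frame. The `ravel` variant avoids building one small array per column pair, which may matter for frames with many columns, though I have not benchmarked it.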