熊猫:连接和重新索引数据帧

时间:2017-04-23 21:01:06

标签: python pandas dataframe concatenation reindex

我想使用新索引将两个pandas数据帧合并到一个新的第三个数据帧中。假设我从以下开始:

df = pd.DataFrame(np.ones(25).reshape((5,5)),index = ['A','B','C','D','E'])
df1 = pd.DataFrame(np.ones(25).reshape((5,5))*2,index = ['A','B','C','D','E'])
df[2] = np.nan
df1[3] = np.nan
df[4] = np.nan
df1[4] = np.nan

我希望用最简单的方法来实现以下结果:

NewIndex    OldIndex    df  df1
1   A   1   2
2   B   1   2
3   C   1   2
4   D   1   2
5   E   1   2
6   A   1   2
7   B   1   2
8   C   1   2
9   D   1   2
10  E   1   2
11  A   NaN 2
12  B   NaN 2
13  C   NaN 2
14  D   NaN 2
15  E   NaN 2
16  A   1   NaN
17  B   1   NaN
18  C   1   NaN
19  D   1   NaN
20  E   1   NaN

最好的方法是什么?

1 个答案:

答案 0 :(得分:1)

您必须取消堆叠数据帧,然后重新索引连接的数据帧。

import numpy as np
import pandas as pd
# test data
df = pd.DataFrame(np.ones(25).reshape((5,5)),index = ['A','B','C','D','E'])
df1 = pd.DataFrame(np.ones(25).reshape((5,5))*2,index = ['A','B','C','D','E'])
df[2] = np.nan
df1[3] = np.nan
df[4] = np.nan
df1[4] = np.nan

# unstack tables and concat
newdf = pd.concat([df.unstack(),df1.unstack()], axis=1)
# reset multiindex for level 1
newdf.reset_index(1, inplace=True)
# rename columns
newdf.columns = ['OldIndex','df','df1']
# drop old index
newdf = newdf.reset_index().drop('index',1)
# set index from 1
newdf.index = np.arange(1, len(newdf) + 1)
# rename new index
newdf.index.name='NewIndex'
print(newdf)

输出:

         OldIndex   df  df1
NewIndex                   
1               A  1.0  2.0
2               B  1.0  2.0
3               C  1.0  2.0
4               D  1.0  2.0
5               E  1.0  2.0
6               A  1.0  2.0
7               B  1.0  2.0
8               C  1.0  2.0
9               D  1.0  2.0
10              E  1.0  2.0
11              A  NaN  2.0
12              B  NaN  2.0
13              C  NaN  2.0
14              D  NaN  2.0
15              E  NaN  2.0
16              A  1.0  NaN
17              B  1.0  NaN
18              C  1.0  NaN
19              D  1.0  NaN
20              E  1.0  NaN
21              A  NaN  NaN
22              B  NaN  NaN
23              C  NaN  NaN
24              D  NaN  NaN
25              E  NaN  NaN