pandas: merging two DataFrames

Time: 2017-09-04 10:48:27

Tags: python pandas dataframe

I'm new to pandas and have a small question about its merge method. Suppose I have two separate tables, as shown below:

Original_DataFrame

machine weekNum Percent
 M1        2      75
 M1        5      80
 M1        8      95
 M1       10      90

New_DataFrame

machine weekNum Percent
 M1        1      100
 M1        2      100
 M1        3      100
 M1        4      100
 M1        5      100
 M1        6      100
 M1        7      100
 M1        8      100
 M1        9      100
 M1       10      100

I used the pandas merge method as follows:

pd.merge(orig_df, new_df, on='weekNum', how='left')

This is what I get:

    machine    weekNum  Percent_x  Percent_y
 0    M1           2      75         100
 1    M1           5      80         100
 2    M1           8      95         100
 3    M1          10      90         100

However, I want to fill in the skipped weekNum values and use 100 for those rows, to get the desired output shown below.

machine weekNum Percent
 M1        1      100
 M1        2      75
 M1        3      100
 M1        4      100
 M1        5      80
 M1        6      100
 M1        7      100
 M1        8      95
 M1        9      100
 M1       10      90

Can someone point me in the right direction?

3 answers:

Answer 0 (score: 1)

I think you need combine_first, but you first have to call set_index on the common columns:

df11 = df1.set_index(['machine','weekNum'])
df22 = df2.set_index(['machine','weekNum'])

df = df11.combine_first(df22).astype(int).reset_index()
print (df)
  machine  weekNum  Percent
0      M1        1      100
1      M1        2       75
2      M1        3      100
3      M1        4      100
4      M1        5       80
5      M1        6      100
6      M1        7      100
7      M1        8       95
8      M1        9      100
9      M1       10       90
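
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It assumes df1 is the question's Original_DataFrame and df2 is New_DataFrame, rebuilt by hand from the tables above; the `keys` variable is only an illustrative name.

```python
import pandas as pd

# Frames rebuilt from the tables in the question (assumed mapping:
# df1 = Original_DataFrame, df2 = New_DataFrame).
df1 = pd.DataFrame({
    "machine": ["M1"] * 4,
    "weekNum": [2, 5, 8, 10],
    "Percent": [75, 80, 95, 90],
})
df2 = pd.DataFrame({
    "machine": ["M1"] * 10,
    "weekNum": list(range(1, 11)),
    "Percent": [100] * 10,
})

keys = ["machine", "weekNum"]  # illustrative name for the shared key columns

# combine_first aligns the two frames on their index, keeps df1's values
# where they exist, and fills the remaining rows from df2.
df = (
    df1.set_index(keys)
       .combine_first(df2.set_index(keys))
       .astype(int)
       .reset_index()
)
print(df)
```

Because the alignment happens on the index, both frames need the same key columns in the index before combine_first is called; that is why set_index comes first.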


df.plot.bar('weekNum', 'Percent')

[Figure: bar chart of Percent by weekNum]

Edit:

To add labels to the bars:

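import matplotlib.pyplot as plt

# pass figsize to plot.bar; a separate plt.figure() call would not size this plot
ax = df.plot.bar('weekNum', 'Percent', figsize=(12, 8))

# write each bar's value just above the bar
for rect, label in zip(ax.patches, df['Percent']):
    height = rect.get_height()
    ax.text(rect.get_x() + rect.get_width() / 2, height + 1, str(label), ha='center', va='bottom')

# leave headroom above the tallest bar for the labels
ax.set_ylim(top=120)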

[Figure: the same bar chart with a value label above each bar]

Answer 1 (score: 0)

Not as elegant as the other solution, but it works anyway:

# join
merged = pd.merge(data1, data2, on=['machine','weekNum'], how='outer')
# combine percent columns
merged['Percent'] = merged['Percent_x'].fillna(merged['Percent_y'])
# remove extra columns
result = merged[['machine','weekNum', 'Percent']]

Result:

machine weekNum Percent
M1  2   75
M1  5   80
M1  8   95
M1  10  90
M1  1   100
M1  3   100
M1  4   100
M1  6   100
M1  7   100
M1  9   100
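
The result above is not in weekNum order. A minimal sketch of the same merge with an explicit sort added, assuming data1 and data2 hold the question's two tables (rebuilt by hand here); the sort and the cast back to int are additions, not part of the original answer.

```python
import pandas as pd

# Assumed inputs: data1 = Original_DataFrame, data2 = New_DataFrame from the question.
data1 = pd.DataFrame({
    "machine": ["M1"] * 4,
    "weekNum": [2, 5, 8, 10],
    "Percent": [75, 80, 95, 90],
})
data2 = pd.DataFrame({
    "machine": ["M1"] * 10,
    "weekNum": list(range(1, 11)),
    "Percent": [100] * 10,
})

# Outer join keeps every (machine, weekNum) pair from either frame.
merged = pd.merge(data1, data2, on=["machine", "weekNum"], how="outer")

# Prefer the original value; fall back to the new one where it is missing.
# fillna leaves a float column, so cast back to int.
merged["Percent"] = merged["Percent_x"].fillna(merged["Percent_y"]).astype(int)

# Sort explicitly instead of relying on the row order merge happens to return.
result = (
    merged[["machine", "weekNum", "Percent"]]
    .sort_values(["machine", "weekNum"])
    .reset_index(drop=True)
)
print(result)
```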

Answer 2 (score: 0)

You could try this. Depending on your overall goal, it may not be "programmatic" enough.
