I have two DataFrames:
date sitename Auto_name AutoCount
2012-05-01 chess.com Autobiographer 8
2012-05-05 chess.com Autobiographer 1
2012-05-15 chess.com Autobiographer 3
and
date sitename Stu_name StudentCount
2012-05-01 chess.com Student 4
2012-05-02 chess.com Student 2
The output should look like this:
date sitename Autoname AutoCount Stu_name Stu_count
2012-05-01 chess.com Autobiographer 8 Student 4
2012-05-02 chess.com Autobiographer 0 Student 2
2012-05-05 chess.com Autobiographer 1 Student 0
2012-05-15 chess.com Autobiographer 3 Student 0
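I want to insert the name and student count from the second DataFrame into the first, matched on the date column. It doesn't look that difficult, but I can't figure it out.
Answer 0 (score: 0)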
You can use the merge function (see the documentation on merging DataFrames: http://pandas.pydata.org/pandas-docs/stable/merging.html). Assuming your DataFrames are named df1 and df2:
In [13]: df = pd.merge(df1, df2, how='outer')
In [14]: df
Out[14]:
date sitename Auto_name AutoCount Stu_name StudentCount
0 2012-05-01 chess.com Autobiographer 8 Student 4
1 2012-05-05 chess.com Autobiographer 1 NaN NaN
2 2012-05-15 chess.com Autobiographer 3 NaN NaN
3 2012-05-02 chess.com NaN NaN Student 2
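The above merges on the columns the two DataFrames have in common (date and sitename in this case), but you can also specify the columns explicitly with the on keyword (see the docs).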
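As a minimal sketch of the on keyword, the snippet below rebuilds the two DataFrames from the question and names the join keys explicitly. The df1 and df2 construction and the variable name merged are assumptions made for illustration; the column names follow the output shown above.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about how
# the frames are built; dates are kept as plain strings here).
df1 = pd.DataFrame({
    "date": ["2012-05-01", "2012-05-05", "2012-05-15"],
    "sitename": ["chess.com"] * 3,
    "Auto_name": ["Autobiographer"] * 3,
    "AutoCount": [8, 1, 3],
})
df2 = pd.DataFrame({
    "date": ["2012-05-01", "2012-05-02"],
    "sitename": ["chess.com"] * 2,
    "Stu_name": ["Student"] * 2,
    "StudentCount": [4, 2],
})

# Name the join keys explicitly instead of relying on the common columns.
merged = pd.merge(df1, df2, how="outer", on=["date", "sitename"])
print(merged)
```

Spelling out the keys is mostly a safeguard: if the two DataFrames later gain another column with the same name, the merge will not silently start joining on it as well.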
As a next step, you can fill the NaN values however you need. Following the example output, that could be:
In [15]: df.fillna({'Auto_name':'Autobiographer', 'AutoCount':0, 'Stu_name':'Student', 'StudentCount':0})
Out[15]:
date sitename Auto_name AutoCount Stu_name StudentCount
0 2012-05-01 chess.com Autobiographer 8 Student 4
1 2012-05-05 chess.com Autobiographer 1 Student 0
2 2012-05-15 chess.com Autobiographer 3 Student 0
3 2012-05-02 chess.com Autobiographer 0 Student 2
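Note that fillna returns a new DataFrame rather than modifying df in place, so the result has to be assigned to keep it. The sketch below chains the steps together and, as additions that are not part of the answer above, also casts the counts back to integers and sorts by date so the rows appear in the order the question asks for. The sample data and the name result are assumptions for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed construction).
df1 = pd.DataFrame({
    "date": ["2012-05-01", "2012-05-05", "2012-05-15"],
    "sitename": ["chess.com"] * 3,
    "Auto_name": ["Autobiographer"] * 3,
    "AutoCount": [8, 1, 3],
})
df2 = pd.DataFrame({
    "date": ["2012-05-01", "2012-05-02"],
    "sitename": ["chess.com"] * 2,
    "Stu_name": ["Student"] * 2,
    "StudentCount": [4, 2],
})

result = (
    pd.merge(df1, df2, how="outer", on=["date", "sitename"])
    # Fill the gaps left by the outer join with per-column defaults.
    .fillna({"Auto_name": "Autobiographer", "AutoCount": 0,
             "Stu_name": "Student", "StudentCount": 0})
    # The NaNs forced the count columns to float; cast them back to int.
    .astype({"AutoCount": int, "StudentCount": int})
    # ISO-formatted date strings sort chronologically as plain text.
    .sort_values("date")
    .reset_index(drop=True)
)
print(result)
```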