根据时间间隔合并熊猫DataFrame

时间:2019-01-30 10:47:55

标签: python-3.x pandas pandas-groupby

我正在尝试合并两个包含每周计数的数据框df_1和包含每月平均计数的df_2。

下面显示的每周计数(df_1)。

date	              Direct_Connection	    Indirect_Connection
2014-12-07        	36	                  124
2014-12-14        	38	                  111
2014-12-21        	28	                  95
2014-12-28        	15	                  58
2015-01-04        	5	                  9
2015-01-11        	22	                  85
2015-01-18        	49	                  174
2015-01-25        	47	                  171
2015-02-01        	54	                  187
2015-02-08 	        45                        216
2015-02-15 	        29                     	  125
2015-02-22        	22	                  63
2015-03-01        	27	                  96

每月平均(df_2),如下所示。

date	      Direct_Connection   Indirect_Connection
2014-12-31	29.25	              97.0
2015-01-31	30.75	              109.75
2015-02-28	37.5	              147.75
2015-03-31	31.0	              119.0

我想完成的合并将为“直接连接”和“间接连接”都向df_1(每周计数)添加一个“关联的月平均”列

下面是我想要的输出示例。

date	              Direct_Connection	Assoc_dir_avg  Indirect_Connection Ass_indir_avg
2014-12-07        	36	              29.5           124                 97
2014-12-14        	38	              29.5           111                 97
2014-12-21        	28	              29.5           95                  97
2014-12-28        	15	              29.5           58                  97
2015-01-04        	5	              30.5           9                   109.75 
2015-01-11        	22	              30.5           85                  109.75
2015-01-18        	49	              30.5           174                 109.75
2015-01-25        	47	              30.5           171                 109.75
2015-02-01        	54	              37.5           187                 147.75
2015-02-08 	        45                    37.5           216             147.75
2015-02-15 	        29                    37.5     	     125             147.75
2015-02-22        	22	              37.5            63                 147.75
2015-03-01        	27	              31              96                 119

以上DataFrames只是整个DF的简短摘录,跨越了很多年。

任何人都能提供的帮助将不胜感激!

1 个答案:

答案 0 :(得分:0)

使用-

# Cast to date if not done already 
df1['date'] = pd.to_datetime(df1['date'].str.strip(), format='%Y-%m-%d')
df2['date'] = pd.to_datetime(df2['date'].str.strip(), format='%Y-%m-%d')

from pandas.tseries.offsets import MonthEnd

df=df1.merge(df2, left_on=(df1['date'] + MonthEnd(1)), right_on='date').rename(columns={'Direct_Connection_x':'Direct_Connection',
                                                                        'Indirect_Connection_x':'Indirect_Connection',
                                                                        'Direct_Connection_y':'Assoc_dir_avg',
                                                                        'Indirect_Connection_y':'Ass_indir_avg',
                                                                        'date_x':'date'}).drop('date_y', axis=1)

输出

         date                      Direct_Connection  Indirect_Connection  \
0  2014-12-07                                     36                124.0   
1  2014-12-14                                     38                111.0   
2  2014-12-21                                     28                 95.0   
3  2014-12-28                                     15                 58.0   
4  2015-01-04                                      5                  9.0   
5  2015-01-11                                     22                 85.0   
6  2015-01-18                                     49                174.0   
7  2015-01-25                                     47                171.0   
8  2015-02-01                                     54                187.0   
9  2015-02-08          45                        216                  NaN   
10 2015-02-15                29                                     125.0   
11 2015-02-22                                     22                 63.0   
12 2015-03-01                                     27                 96.0   

    Assoc_dir_avg  Ass_indir_avg  
0           29.25          97.00  
1           29.25          97.00  
2           29.25          97.00  
3           29.25          97.00  
4           30.75         109.75  
5           30.75         109.75  
6           30.75         109.75  
7           30.75         109.75  
8           37.50         147.75  
9           37.50         147.75  
10          37.50         147.75  
11          37.50         147.75  
12          31.00         119.00