Question

I have the following pandas DataFrame:

stdate      enddate     count
2004-01-04  2004-01-10  68
2004-01-11  2004-01-17  100
2004-01-18  2004-01-24  83
2004-01-25  2004-01-31  56
2004-02-01  2004-02-07  56
2004-02-08  2004-02-14  68
2004-02-15  2004-02-21  81
2004-02-22  2004-02-28  68
2004-02-29  2004-03-06  76

I want to compute the average per month.

I would like the result to look like this:

date    count
2004-01 (306/25-4)
2004-02 (349/28-01)

For example, for the second month the enddate falls in month 3. (I need help doing this aggregation with pandas.)

Answer 1

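It is not complicated, but it takes a bit of work. I think you should step outside pandas for most of the calculation and build a DataFrame at the end.

Suppose you have two datetime objects, b and e. The difference between them in days is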

(e - b).days

This gives you the number of days by which to divide a row's count.
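As a small illustration of that step, here is a sketch using the first row of the question's data. The variable names n_days and per_day are mine, and note that (e - b).days does not count the end date itself; add 1 if you want the interval to include it.

```python
from datetime import datetime

# First row of the question's data: 2004-01-04 .. 2004-01-10, count 68
b = datetime(2004, 1, 4)
e = datetime(2004, 1, 10)
count = 68

# Subtracting two datetimes gives a timedelta; .days is its whole-day part
n_days = (e - b).days

# Average count per day for this row (the end date is not counted here)
per_day = count / n_days
print(n_days, per_day)
```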

Also, given a month, you can find the last day of that month using the calendar module.
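For instance, calendar.monthrange(year, month) returns a pair whose second element is the number of days in the month. The helper name last_day_of_month below is hypothetical, just for illustration:

```python
import calendar
from datetime import date


def last_day_of_month(year, month):
    """Return the last calendar day of the given month as a date."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    n_days = calendar.monthrange(year, month)[1]
    return date(year, month, n_days)


# 2004 is a leap year, so February has 29 days
feb_end = last_day_of_month(2004, 2)
print(feb_end)
```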

So you can do something like the following:

counts_per_month = {}
def process_row(b, e, count):
    ...
    # Find how count splits between the months, 
    #    update counts_per_month accordingly

Now call

# Use r["count"] here: r.count would resolve to the Series.count method, not the column
df.apply(lambda r: process_row(r.stdate, r.enddate, r["count"]), axis=1)

At this point, counts_per_month will contain your data. Finish by calling pd.DataFrame.from_dict to build the final DataFrame.
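Putting the pieces together, here is a minimal sketch of one way process_row could be filled in. It rests on assumptions of mine rather than anything the question confirms: each row is treated as covering the half-open interval from stdate up to (not including) enddate, so that (e - b).days is the divisor, and the count is spread evenly over those days. The "YYYY-MM" keys and the "count" column name in the result are also my own choices.

```python
import calendar
from datetime import timedelta

import pandas as pd

# The data from the question
df = pd.DataFrame({
    "stdate": pd.to_datetime([
        "2004-01-04", "2004-01-11", "2004-01-18", "2004-01-25", "2004-02-01",
        "2004-02-08", "2004-02-15", "2004-02-22", "2004-02-29"]),
    "enddate": pd.to_datetime([
        "2004-01-10", "2004-01-17", "2004-01-24", "2004-01-31", "2004-02-07",
        "2004-02-14", "2004-02-21", "2004-02-28", "2004-03-06"]),
    "count": [68, 100, 83, 56, 56, 68, 81, 68, 76],
})

counts_per_month = {}


def process_row(b, e, count):
    """Split count across months in proportion to the days in each month.

    Assumption: the row covers [b, e), so it spans (e - b).days days.
    """
    total_days = (e - b).days
    if total_days <= 0:
        return  # nothing to distribute for an empty or reversed interval
    per_day = count / total_days
    cur = b
    while cur < e:
        # First day of the month after cur, found via the month's last day
        last_day = calendar.monthrange(cur.year, cur.month)[1]
        next_month = cur.replace(day=last_day) + timedelta(days=1)
        seg_end = min(e, next_month)
        key = cur.strftime("%Y-%m")
        share = per_day * (seg_end - cur).days
        counts_per_month[key] = counts_per_month.get(key, 0.0) + share
        cur = seg_end


df.apply(lambda r: process_row(r.stdate, r.enddate, r["count"]), axis=1)

# One row per month, indexed by the "YYYY-MM" key
result = pd.DataFrame.from_dict(counts_per_month, orient="index", columns=["count"])
print(result)
```

With this convention, only the last row straddles two months: one of its six days lands in February and the other five in March. If enddate should count as a full day, the interval arithmetic needs a one-day adjustment.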
