Create a new result column by counting days between two other columns with a function

Time: 2017-11-21 11:48:03

Tags: python pandas numpy

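I am counting the number of days between two dates stored in two columns, but only the days that fall between May and August (the growing season for trees), and I want to fill a new column with the result: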

import pandas as pd
import numpy as np
from datetime import datetime

df = pd.DataFrame(
    columns=['Start', 'End'],
    data=[
        [np.datetime64('2001-01-01'), np.datetime64('2001-07-01')],
        [np.datetime64('2001-01-01'), np.datetime64('2001-11-01')],
    ],
)

def vegetation_days(date1, date2):
    startdate=date1.astype(datetime)
    enddate=date2.astype(datetime)
    all_dates = (startdate + datetime.timedelta(days=x) for x in range(0, (enddate-startdate).days))
    return (sum(1 for date in all_dates if (5 <= date.month <=7)))

Then:

df:

       Start        End
0 2001-01-01 2001-07-01
1 2001-01-01 2001-11-01

df['Days'] = vegetation_days(df['Start'],df['End'])

This gives me the error:

AttributeError: 'Series' object has no attribute 'days'

How can I solve this?

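1 Answer:

Answer 0: (score: 1)

Use DataFrame.apply with axis=1, so that vegetation_days receives one pair of scalar Timestamp values per row instead of two whole Series. Subtracting two Series gives another Series, which has no .days attribute; subtracting two Timestamps gives a Timedelta, which does: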

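def vegetation_days(date1, date2):
    # Every day from date1 up to, but not including, date2
    all_dates = (date1 + pd.Timedelta(days=x) for x in range(0, (date2 - date1).days))
    # Count only the days whose month is 5, 6 or 7
    return sum(1 for date in all_dates if 5 <= date.month <= 7)

# axis=1 passes each row to the lambda, so x['Start'] and x['End'] are scalars
df['Days'] = df.apply(lambda x: vegetation_days(x['Start'], x['End']), axis=1)
print(df)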
       Start        End  Days
0 2001-01-01 2001-07-01    61
1 2001-01-01 2001-11-01    92
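
As a rough, self-contained sketch of the same idea (not part of the original answer), the snippet below first shows that the difference of two datetime columns is a timedelta Series whose day counts are reached through the .dt accessor, and then counts in-season days per row with pd.date_range instead of a generator. The parameters first_month and last_month are hypothetical names introduced here for illustration; their defaults of 5 and 7 mirror the month test in the answer's code, and last_month=8 could be passed if August should be counted as well.

```python
import pandas as pd

df = pd.DataFrame({
    'Start': pd.to_datetime(['2001-01-01', '2001-01-01']),
    'End': pd.to_datetime(['2001-07-01', '2001-11-01']),
})

# Column minus column gives a timedelta Series; whole days come from
# the .dt accessor rather than a plain .days attribute.
span = (df['End'] - df['Start']).dt.days

def vegetation_days(start, end, first_month=5, last_month=7):
    # first_month / last_month are illustrative parameters (assumption);
    # the defaults follow the 5 <= month <= 7 test used in the answer.
    # Build all days from start up to, but not including, end.
    days = pd.date_range(start, end - pd.Timedelta(days=1), freq='D')
    in_season = (days.month >= first_month) & (days.month <= last_month)
    return int(in_season.sum())

# Iterate over the two columns pairwise so each call gets scalar Timestamps.
df['Days'] = [vegetation_days(s, e) for s, e in zip(df['Start'], df['End'])]
print(df)
```

Zipping the two columns is only one possible choice here; df.apply with axis=1, as in the answer, should give the same Days values.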