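I am calculating the number of days between two dates stored in two columns, counting only the days that fall between May and August (the growing season for trees), and storing the result in a new column: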
import pandas as pd
import numpy as np
from datetime import datetime

df = pd.DataFrame(
    columns=['Start', 'End'],
    data=[[np.datetime64('2001-01-01'), np.datetime64('2001-07-01')],
          [np.datetime64('2001-01-01'), np.datetime64('2001-11-01')]])

def vegetation_days(date1, date2):
    startdate = date1.astype(datetime)
    enddate = date2.astype(datetime)
    all_dates = (startdate + datetime.timedelta(days=x) for x in range(0, (enddate - startdate).days))
    return sum(1 for date in all_dates if 5 <= date.month <= 7)
Then:
df:
       Start        End
0 2001-01-01 2001-07-01
1 2001-01-01 2001-11-01
df['Days'] = vegetation_days(df['Start'],df['End'])
This gives me the error:
AttributeError: 'Series' object has no attribute 'days'
How can I fix this?
Answer 0 (score: 1)
def vegetation_days(date1, date2):
    all_dates = (date1 + pd.Timedelta(days=x) for x in range(0, (date2 - date1).days))
    return sum(1 for date in all_dates if 5 <= date.month <= 7)

df['Days'] = df.apply(lambda x: vegetation_days(x['Start'], x['End']), axis=1)
print(df)
       Start        End  Days
0 2001-01-01 2001-07-01    61
1 2001-01-01 2001-11-01    92
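
For context, the original call passes two whole Series into the function, so `date2 - date1` is a Series of timedeltas, and a Series has no `.days` attribute (the per-element values are reached through `.dt.days`). Working row by row avoids this. The following is a minimal alternative sketch, not the method from the answer above: it builds the days of each row with `pd.date_range` and counts those whose month is in the season. The `months` parameter is a hypothetical addition for illustration. Its default of May through July mirrors the `5 <= date.month <= 7` filter used in the code, and passing `(5, 6, 7, 8)` would extend the season through August as described in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Start": pd.to_datetime(["2001-01-01", "2001-01-01"]),
    "End": pd.to_datetime(["2001-07-01", "2001-11-01"]),
})

def vegetation_days(start, end, months=(5, 6, 7)):
    """Count the days in [start, end) whose month is in `months`.

    `months` is a hypothetical parameter added for this sketch.
    """
    # date_range includes both endpoints, so stop one day early to
    # exclude the end date, as range(0, (end - start).days) does.
    days = pd.date_range(start, end - pd.Timedelta(days=1), freq="D")
    return int(days.month.isin(list(months)).sum())

# Iterate over the rows so the function receives scalar Timestamps,
# not whole Series.
df["Days"] = [vegetation_days(s, e) for s, e in zip(df["Start"], df["End"])]
print(df)
```

With the default months this should reproduce the `Days` column shown above. Each row still materializes one entry per day, so for very long ranges or very large frames a closed-form count per year may be preferable.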