ValueError when using pandas date_range in a for loop

Time: 2018-12-28 07:34:06

Tags: python-3.x pandas for-loop date-range

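I have a raw dataset of projects. For each project it contains a start date, an end date, and a total value. I calculate a daily value (dagwaarde) by dividing the total value by the number of days.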
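A small worked example of that division may help here. It is only a sketch with made-up numbers (the dates and the value 1200 are assumptions, not data from my set). It shows that subtracting the two dates counts one day fewer than an inclusive pd.date_range over the same dates, so the daily values summed over the range can exceed the project's total value if the end date is meant to be included.

```python
import pandas as pd

# Hypothetical project: runs from 1 Jan to 4 Jan 2018, total value 1200
start = pd.Timestamp("2018-01-01")
end = pd.Timestamp("2018-01-04")
value = 1200.0

# Subtracting the dates gives the elapsed time: 3 days
days_by_subtraction = (end - start) / pd.Timedelta(days=1)

# date_range includes both endpoints: 4 calendar days
days_in_range = len(pd.date_range(start, end))

# Daily value under each day count
dagwaarde_by_subtraction = value / days_by_subtraction
dagwaarde_inclusive = value / days_in_range

# Spreading the subtraction-based daily value over the inclusive range
# distributes more than the original total value
spread_total = dagwaarde_by_subtraction * days_in_range
```

Whether the end date should count as a project day depends on the data, so which of the two day counts is right is a choice to make explicitly.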

I want to fill a dataset with the dates from 1-1-2018 to 1-1-2020 that holds, for each day, the sum of the daily values of all projects.

Everything works except this part:

df_date_range = pd.date_range(begin,einde)
This line raises the following error:

ValueError: NaTType does not support time
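The message points to a NaT (missing datetime) reaching pd.date_range. The sketch below assumes that at least one project has an empty start or end date. The two-row frame is a made-up sample, not my real data, and try_range is a hypothetical helper. It counts the rows with a missing endpoint and shows that pd.date_range raises a ValueError for such a row. The exact wording of the message differs between pandas versions.

```python
import pandas as pd

# Hypothetical sample: the second project has no end date
sample = pd.DataFrame({
    "startdatum": ["2018-03-01", "2018-04-01"],
    "einddatum": ["2018-03-05", None],
})
sample["startdatum"] = pd.to_datetime(sample["startdatum"])
sample["einddatum"] = pd.to_datetime(sample["einddatum"])  # None becomes NaT

# Count rows where either endpoint is missing
missing = sample[["startdatum", "einddatum"]].isna().any(axis=1)
n_missing = int(missing.sum())


def try_range(start, end):
    """Return the number of days in the range, or the error text on failure."""
    try:
        return len(pd.date_range(start, end))
    except ValueError as exc:
        return f"ValueError: {exc}"


# First row yields a day count, second row yields the error text
results = [try_range(row.startdatum, row.einddatum) for row in sample.itertuples()]
print(n_missing, results)
```

Running the same isna() check on Pipedrive_IN would show whether the real data has such rows.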

This is the code I am using:

import numpy as np
import pandas as pd

# Original DF (Pipedrive) with data about projects:
# startdatum (start), einddatum (end), dagwaarde (day value).
# Day value is total value ('Value') / number of days

Pipedrive['einddatum'] = pd.to_datetime(Pipedrive['einddatum'])
Pipedrive['startdatum'] = pd.to_datetime(Pipedrive['startdatum'])

Pipedrive['Days'] = Pipedrive['einddatum'].sub(Pipedrive['startdatum'], axis =0)
Pipedrive.head()
Pipedrive['Days'] = Pipedrive['Days'] / np.timedelta64(1, 'D')
Pipedrive['dagwaarde'] = Pipedrive['Value'] / Pipedrive['Days']

#Create DF to work with 
Pipedrive_IN = Pipedrive[["stage_order_nr","dagwaarde",'einddatum', 'startdatum', 'Days' ]]

#make a list of all begin and end dates you want to have filled 
begin = '2018-01-01'  # start date
einde = '2020-01-01'  # end date

#make a DF with a timedate index 
datetimeindex = pd.date_range(begin,einde)
df_dates = pd.DataFrame(datetimeindex, columns=['date'])
df_dates = df_dates.set_index('date')
df_dates = df_dates.fillna(0)

for index, value in Pipedrive_IN.iterrows():
    begin = value.startdatum  # start date
    einde = value.einddatum  # end date
    dagwaarde = value.dagwaarde # dagwaarde

    #make DF with timedate index 
    df_date_range = pd.date_range(begin,einde)
    df_proj = pd.DataFrame(df_date_range, columns=['date'])
    df_proj['dagwaarde'] = dagwaarde
    df_proj = df_proj.set_index('date')
    df_proj=df_proj.dropna()
    print(df_proj.head())

    #add original DF to df_dates
    df_dates = df_dates.join(df_proj, lsuffix='', rsuffix=str(index))
    df_dates = df_dates.fillna(0)
    print(df_dates.head(20))

#print result
df_dates['total']=df_dates.sum(axis=1)
print(df_dates.head(50))
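One possible way around the error, sketched under the assumption that rows with a missing start or end date can simply be left out of the totals. The three-row frame below is a hypothetical stand-in for Pipedrive_IN. Instead of joining one column per project, it adds each project's daily value onto a single running total.

```python
import pandas as pd

# Hypothetical stand-in for Pipedrive_IN; the third project has no start date
projects = pd.DataFrame({
    "startdatum": pd.to_datetime(["2018-01-01", "2018-01-03", None]),
    "einddatum": pd.to_datetime(["2018-01-04", "2018-01-05", "2018-01-10"]),
    "dagwaarde": [10.0, 5.0, 7.0],
})

# One entry per calendar day, starting at zero
calendar = pd.date_range("2018-01-01", "2020-01-01")
total = pd.Series(0.0, index=calendar, name="total")

# Assumption: projects without a start or end date are skipped
valid = projects.dropna(subset=["startdatum", "einddatum"])

for row in valid.itertuples(index=False):
    days = pd.date_range(row.startdatum, row.einddatum)
    # Keep only the days that fall inside the calendar window
    days = days.intersection(calendar)
    total.loc[days] += row.dagwaarde

print(total.head(10))
```

Keeping a single total column avoids the growing number of suffixed dagwaarde columns, at the cost of losing the per-project breakdown.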

0 answers:

No answers yet