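Create a new column containing the sum of a date and a number of months

Time: 2019-03-21 09:39:00

Tags: python pandas dataframe

I am trying to create a new column that is the sum of a date column [Purchase date] and another column containing a number of months [Mainte3].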

df['Purchase date'] = pd.to_datetime(df['Purchase date'], format='%m/%d/%Y').dt.strftime('%Y-%m-%d') #pass column to a date and then change format

df['New Date'] = df.apply(lambda x: x['Purchase date'] + pd.DateOffset(months = x['Mainte3']), axis=1)

df["Purchase date"].dtypes
object
df["Mainte3"].dtypes
float64

The table has the following format: table snip

But I get an error:

    if any(x is not None and x != int(x) for x in (years, months)):
ValueError: ('cannot convert float NaN to integer', 'occurred at index 0')

Any help is welcome. Thanks.

1 answer:

Answer 0: (score: 0)

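The problem is probably missing values: pd.DateOffset cannot be built from a NaN number of months. So select only the rows with no missing values, using DataFrame.notna together with DataFrame.all to test whether all values in each row are True: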

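import numpy as np
import pandas as pd

rng = pd.date_range('2017-04-03', periods=10)
# float months, matching the float64 dtype of Mainte3 in the question
df = pd.DataFrame({'Purchase date': rng, 'Mainte3': np.arange(10, dtype=float)})
df.iloc[0, 1] = np.nan   # missing number of months
df.iloc[3, 0] = pd.NaT   # missing purchase date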
print (df)
  Purchase date  Mainte3
0    2017-04-03      NaN
1    2017-04-04      1.0
2    2017-04-05      2.0
3           NaT      3.0
4    2017-04-07      4.0
5    2017-04-08      5.0
6    2017-04-09      6.0
7    2017-04-10      7.0
8    2017-04-11      8.0
9    2017-04-12      9.0

# with real data, remove strftime so the column stays datetime64 instead of strings
#df['Purchase date'] = pd.to_datetime(df['Purchase date'], format='%m/%d/%Y')
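
Why strftime has to go, as a minimal sketch: .dt.strftime returns strings, which is why the question shows dtype object for "Purchase date", and a string cannot be added to a pd.DateOffset. The two sample dates below are made up for illustration and are not taken from the asker's data.

```python
import pandas as pd

# Hypothetical sample values in the question's '%m/%d/%Y' format
raw = pd.Series(['04/03/2017', '04/04/2017'])

parsed = pd.to_datetime(raw, format='%m/%d/%Y')   # datetime64 column
as_text = parsed.dt.strftime('%Y-%m-%d')          # formatted back into strings

print(parsed.dtype)
print(as_text.dtype)

# Date arithmetic works on the parsed values...
shifted = parsed[0] + pd.DateOffset(months=1)
print(shifted)

# ...but not on the formatted strings
try:
    as_text[0] + pd.DateOffset(months=1)
    string_add_failed = False
except TypeError:
    string_add_failed = True
print(string_add_failed)
```

If the 'YYYY-MM-DD' text form is needed for display, one option is to apply strftime at the very end, after "New Date" has been computed.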

mask = df[['Purchase date','Mainte3']].notna().all(axis=1)
df.loc[mask, 'New Date'] = df[mask].apply(lambda x: x['Purchase date'] + 
                                          pd.DateOffset(months = x['Mainte3']), axis=1)
print (df)
  Purchase date  Mainte3   New Date
0    2017-04-03      NaN        NaT
1    2017-04-04      1.0 2017-05-04
2    2017-04-05      2.0 2017-06-05
3           NaT      3.0        NaT
4    2017-04-07      4.0 2017-08-07
5    2017-04-08      5.0 2017-09-08
6    2017-04-09      6.0 2017-10-09
7    2017-04-10      7.0 2017-11-10
8    2017-04-11      8.0 2017-12-11
9    2017-04-12      9.0 2018-01-12
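
An alternative to masking, sketched under a few assumptions: the missing-value check can live inside the function passed to apply, which then returns NaT itself. The helper name add_months is hypothetical, the three-row frame is invented for illustration, and int(months) assumes Mainte3 only ever holds whole numbers (a fractional value would be silently truncated).

```python
import numpy as np
import pandas as pd

def add_months(row):
    """Return purchase date + months, or NaT when either value is missing."""
    date, months = row['Purchase date'], row['Mainte3']
    if pd.isna(date) or pd.isna(months):
        return pd.NaT
    # DateOffset needs a whole number of months; cast the float explicitly
    return date + pd.DateOffset(months=int(months))

# Small made-up frame with one missing value in each column
df = pd.DataFrame({
    'Purchase date': pd.to_datetime(['2017-04-03', '2017-04-04', None]),
    'Mainte3': [np.nan, 1.0, 3.0],
})
df['New Date'] = df.apply(add_months, axis=1)
print(df)
```

This gives the same result as the mask for the rows that are complete; which form reads better is mostly a matter of taste.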