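I am trying to create a new column that is the sum of a date column [Purchase date] and another column containing a number of months [Mainte3].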
df['Purchase date'] = pd.to_datetime(df['Purchase date'], format='%m/%d/%Y').dt.strftime('%Y-%m-%d') #pass column to a date and then change format
df['New Date'] = df.apply(lambda x: x['Purchase date'] + pd.DateOffset(months = x['Mainte3']), axis=1)
df["Purchase date"].dtypes
object
df["Mainte3"].dtypes
float64
The table has the following format: (image: table snip)
But I get an error:
if any(x is not None and x != int(x) for x in (years, months)):
ValueError: ('cannot convert float NaN to integer', 'occurred at index 0')
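Any help is welcome. Thanks.
Answer 0 (score: 0)
The problem is probably missing values. Use DataFrame.notna together with DataFrame.all to test whether all values in each row are True (that is, not missing), and compute the new date only for those rows: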
import numpy as np
import pandas as pd

rng = pd.date_range('2017-04-03', periods=10)
df = pd.DataFrame({'Purchase date': rng, 'Mainte3': range(10)})
df.iloc[0, 1] = np.nan
df.iloc[3, 0] = np.nan
print (df)
  Purchase date  Mainte3
0    2017-04-03      NaN
1    2017-04-04      1.0
2    2017-04-05      2.0
3           NaT      3.0
4    2017-04-07      4.0
5    2017-04-08      5.0
6    2017-04-09      6.0
7    2017-04-10      7.0
8    2017-04-11      8.0
9    2017-04-12      9.0
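# With real data, drop the .dt.strftime(...) call: it turns the dates back
# into strings (dtype object), and DateOffset can only be added to datetimes.
#df['Purchase date'] = pd.to_datetime(df['Purchase date'], format='%m/%d/%Y')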
mask = df[['Purchase date','Mainte3']].notna().all(axis=1)
df.loc[mask, 'New Date'] = df[mask].apply(lambda x: x['Purchase date'] +
pd.DateOffset(months = x['Mainte3']), axis=1)
print (df)
  Purchase date  Mainte3    New Date
0    2017-04-03      NaN         NaT
1    2017-04-04      1.0  2017-05-04
2    2017-04-05      2.0  2017-06-05
3           NaT      3.0         NaT
4    2017-04-07      4.0  2017-08-07
5    2017-04-08      5.0  2017-09-08
6    2017-04-09      6.0  2017-10-09
7    2017-04-10      7.0  2017-11-10
8    2017-04-11      8.0  2017-12-11
9    2017-04-12      9.0  2018-01-12
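If this is needed in more than one place, the same idea can be wrapped in a small helper. The following is only a sketch: the function name add_months and its parameters are hypothetical, it assumes the dates arrive as '%m/%d/%Y' strings as in the question, and it uses a plain loop instead of apply so that an input with no complete rows still yields a datetime column.

```python
import numpy as np
import pandas as pd


def add_months(df, date_col="Purchase date", months_col="Mainte3",
               out_col="New Date", date_format="%m/%d/%Y"):
    """Return a copy of df with out_col = date_col shifted by months_col months.

    Hypothetical helper: rows where the date or the month count is
    missing get NaT in out_col instead of raising.
    """
    out = df.copy()
    # Keep the column as datetime64 (no strftime); unparseable values become NaT.
    out[date_col] = pd.to_datetime(out[date_col], format=date_format, errors="coerce")
    # True only for rows where both inputs are present.
    mask = out[[date_col, months_col]].notna().all(axis=1)
    values = [
        # int(...) because DateOffset expects a whole number of months.
        d + pd.DateOffset(months=int(m)) if ok else pd.NaT
        for d, m, ok in zip(out[date_col], out[months_col], mask)
    ]
    out[out_col] = pd.Series(values, index=out.index, dtype="datetime64[ns]")
    return out


# Small made-up sample in the question's format.
df = pd.DataFrame({
    "Purchase date": ["04/03/2017", "04/04/2017", None, "04/12/2017"],
    "Mainte3": [np.nan, 1.0, 3.0, 9.0],
})
result = add_months(df)
print(result)
```

Rows 0 and 2 have a missing month count or date, so they end up as NaT, while the other rows are shifted by the given number of months.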