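I have date values in two formats (6/13/2018 and 6-13-2018) and need to calculate the difference between two date columns. My attempt is below.
Problem: for a few rows, the resulting number of days is wrong.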
X['Date of Closing'] = X['Date of Closing'].str.replace('/','-')
X['Date of First Contact'] = X['Date of First Contact'].str.replace('/','-')
X['Date Difference'] = (pd.to_datetime(X['Date of Closing'])- pd.to_datetime(X['Date of First Contact'])).dt.days
Sample data:
  Date of First Contact  Date of Giving Proposal  Date of Closing
0            13-01-2014               26-02-2014       26-02-2014
1            28-01-2014                 2/2/2014         2-2-2014
2             11-1-2014               26-01-2014       26-01-2014
3            18-01-2014               18-01-2014       18-01-2014
4            14-01-2014               14-01-2014       14-01-2014
5              5-1-2014               14-01-2014       14-01-2014
Output:
44 - correct
5 - correct
-279 - incorrect
0 - correct
0 - correct
-107 - incorrect
Answer 0 (score: 1)
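I think you need the dayfirst=True parameter, or an explicit format. The two wrong rows are the ones where the day is 12 or less, so the value can also be read month-first: 11-1-2014 is being parsed as November 1, 2014 instead of January 11, 2014 (hence -279), and 5-1-2014 as May 1, 2014 instead of January 5, 2014 (hence -107):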
# Option 1: tell pandas that the day comes before the month
X['Date Difference'] = (pd.to_datetime(X['Date of Closing'], dayfirst=True) -
                        pd.to_datetime(X['Date of First Contact'], dayfirst=True)).dt.days

# Option 2: give the exact format (assumes '/' was already replaced with '-')
X['Date Difference'] = (pd.to_datetime(X['Date of Closing'], format='%d-%m-%Y') -
                        pd.to_datetime(X['Date of First Contact'], format='%d-%m-%Y')).dt.days
print(X)
Date of First Contact Date of Giving Proposal Date of Closing \
0 13-01-2014 26-02-2014 26-02-2014
1 28-01-2014 2/2/2014 2-2-2014
2 11-1-2014 26-01-2014 26-01-2014
3 18-01-2014 18-01-2014 18-01-2014
4 14-01-2014 14-01-2014 14-01-2014
5 5-1-2014 14-01-2014 14-01-2014
Date Difference
0 44
1 5
2 15
3 0
4 0
5 9
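
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch built from the sample rows in the question. The helper name parse_day_first is my own, not part of pandas, and the last few lines force a month-first format on row 2 only to illustrate where the -279 comes from; I am assuming that is what happened in the original run.

```python
import pandas as pd

# Sample rows copied from the question (day-first dates, mixed zero-padding).
X = pd.DataFrame({
    "Date of First Contact": ["13-01-2014", "28-01-2014", "11-1-2014",
                              "18-01-2014", "14-01-2014", "5-1-2014"],
    "Date of Closing": ["26-02-2014", "2-2-2014", "26-01-2014",
                        "18-01-2014", "14-01-2014", "14-01-2014"],
})


def parse_day_first(col: pd.Series) -> pd.Series:
    """Hypothetical helper: normalize '/' to '-' and parse as day-month-year."""
    normalized = col.str.replace("/", "-", regex=False)
    return pd.to_datetime(normalized, format="%d-%m-%Y")


X["Date Difference"] = (
    parse_day_first(X["Date of Closing"])
    - parse_day_first(X["Date of First Contact"])
).dt.days
print(X["Date Difference"].tolist())

# Illustration only: reading row 2's first-contact date month-first
# turns 11 January into 1 November and yields the wrong difference.
closing = pd.to_datetime("26-01-2014", format="%d-%m-%Y")
first_contact_month_first = pd.to_datetime("11-1-2014", format="%m-%d-%Y")
misparsed = (closing - first_contact_month_first).days
print(misparsed)
```

Passing an explicit format has the side benefit that a value which does not fit day-month-year raises an error instead of being silently reinterpreted.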