The dataframe has two date columns stored with the "object" dtype. StartDate and EndDate are in mm/dd/yyyy format.
Name StartDate EndDate
bou1 1/9/2017 1/10/2017
bou2 12/31/2016 1/10/2017
Expected output:
Name StartDate EndDate Diff
bou1 1/9/2017 1/10/2017 1
bou2 12/31/2016 1/10/2017 10
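Any suggestions would be greatly appreciated!
Answer 0 (score: 1)
You first need to convert these columns to datetime, then subtract one from the other.
Try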
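import numpy as np
import pandas as pd
# Convert the string columns to datetime (the column is 'StartDate', not 'startDate')
df['StartDate'] = pd.to_datetime(df['StartDate'])
df['EndDate'] = pd.to_datetime(df['EndDate'])
df['difInDate'] = abs(df['StartDate'].sub(df['EndDate'], axis=0)) / np.timedelta64(1, 'D')
print(df['difInDate'])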
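abs is only there to make the number of days positive, because subtracting the later date (EndDate) from the earlier one (StartDate) gives a negative result.
Alternatively, you can subtract in the other direction with df['EndDate'].sub(df['StartDate']) and skip abs.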
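Both variants can be checked against the sample data from the question. The following is a minimal sketch, not the answerer's exact code: it rebuilds the two rows by hand and passes an explicit format="%m/%d/%Y" to pd.to_datetime, which is an assumption based on the mm/dd/yyyy format stated in the question. The names fmt, start, end, signed and reversed_abs are made up for illustration.

```python
import pandas as pd

# Sample rows from the question, with dates kept as strings (object dtype)
df = pd.DataFrame({
    "Name": ["bou1", "bou2"],
    "StartDate": ["1/9/2017", "12/31/2016"],
    "EndDate": ["1/10/2017", "1/10/2017"],
})

# Assumption: every value follows mm/dd/yyyy, so the format is given explicitly
fmt = "%m/%d/%Y"
start = pd.to_datetime(df["StartDate"], format=fmt)
end = pd.to_datetime(df["EndDate"], format=fmt)

# End minus start is already positive when EndDate is the later date
signed = (end - start).dt.days

# Start minus end is negative, so abs() is needed to get positive day counts
reversed_abs = (start - end).abs().dt.days

print(signed.tolist(), reversed_abs.tolist())
```

With an explicit format, a value that does not match mm/dd/yyyy raises an error instead of being parsed with day and month guessed.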
Answer 1 (score: 0)
import pandas as pd

# Recreating your dataframe with dates stored as strings
df = pd.DataFrame({'Name': ['bou1', 'bou2'],
                   'StartDate': ['01/09/2017', '12/31/2016'],
                   'EndDate': ['01/10/2017', '01/10/2017']})
# Date strings converted with pd.to_datetime
df['StartDate'] = pd.to_datetime(df['StartDate'])
df['EndDate'] = pd.to_datetime(df['EndDate'])
# Subtracting gives a timedelta Series; .dt.days extracts the whole days as integers
df['Diff'] = (df['EndDate'] - df['StartDate']).dt.days
# Just prints the columns in your order
df[['Name', 'StartDate', 'EndDate', 'Diff']]
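The two answers return slightly different things: dividing by a one-day timedelta gives a float, while .dt.days gives whole days as integers. The sketch below illustrates this with hypothetical timestamps that include a time of day, which the question's data does not have, so for the original dates both approaches agree. The variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical timestamps; the first end value falls at midday
start = pd.to_datetime(pd.Series(["2017-01-09 00:00", "2016-12-31 00:00"]))
end = pd.to_datetime(pd.Series(["2017-01-10 12:00", "2017-01-10 00:00"]))

delta = end - start  # timedelta64 Series

# .dt.days keeps only the whole-day component, as integers
whole_days = delta.dt.days

# Dividing by a one-day Timedelta keeps the fractional part, as floats
fractional_days = delta / pd.Timedelta(days=1)

print(whole_days.tolist())
print(fractional_days.tolist())
```

If the columns only ever hold dates without a time component, .dt.days is the simpler choice and matches the integer Diff column in the expected output.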