I have a pandas DataFrame with the following contents:
df =
start end
01/April 02/May
12/April 12/April
I need to add a column holding the difference, in days, between the end and start values (end - start).
How can I do this?
I tried the following:
import pandas as pd
df.startdate = pd.datetime(df.start, format='%B/%d')
df.enddate = pd.datetime(df.end, format='%B/%d')
but I'm not sure whether this is the right direction.
Answer 0 (score: 1)
Here is one way.
df['start'] = pd.to_datetime(df['start']+'/2018', format='%d/%B/%Y')
df['end'] = pd.to_datetime(df['end']+'/2018', format='%d/%B/%Y')
df['diff'] = df['end'] - df['start']
# start end diff
# 0 2018-04-01 2018-05-02 31 days
# 1 2018-04-12 2018-04-12 0 days
Answer 1 (score: 1)
import pandas as pd
df = pd.DataFrame({"start":["01/April", "12/April"], "end": ["02/May", "12/April"]})
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])
df["diff"] = (df["end"] - df["start"])
Output:
end start diff
0 2018-05-02 2018-04-01 31 days
1 2018-04-12 2018-04-12 0 days
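Both answers leave `diff` as a timedelta column ("31 days"). Since the question asks for the difference in days, a plain integer column may be more convenient. The following is a minimal sketch, not a definitive solution. It assumes the dates belong to 2018 (the strings carry no year, so the year is an assumption, as in Answer 0) and that month names are in English, because `%B` depends on the locale. The column name `diff_days` is made up for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"start": ["01/April", "12/April"],
                   "end": ["02/May", "12/April"]})

# Assumption: the source strings carry no year, so one is supplied explicitly.
year = "2018"

# An explicit format avoids relying on format inference.
# Note: %B matches full month names in the current locale (English assumed here).
start = pd.to_datetime(df["start"] + "/" + year, format="%d/%B/%Y")
end = pd.to_datetime(df["end"] + "/" + year, format="%d/%B/%Y")

# Subtracting two datetime Series gives a timedelta Series;
# .dt.days extracts the whole-day component as integers.
df["diff_days"] = (end - start).dt.days

print(df)
```

Keeping the parsed dates in separate variables leaves the original `start` and `end` strings untouched; assign them back to the DataFrame if datetime columns are wanted instead.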