Given a date column, I want to create another column, diff, that holds the number of days elapsed since the first date.
date diff
2011-01-01 00:00:10 0
2011-01-01 00:00:11 0.000011 days
2011-02-01 00:00:11 30.000011 days
2013-02-01 00:00:11 395.000011 days
2014-02-01 00:00:11 760.000011 days
The dates are datetimes. What I have tried so far:
df = df.sort_values(['date'], ascending=True)
df.set_index('date', inplace = True)
first = df.index[0]
df['diff'] = (first - df.index.shift()).fillna(0)
Answer 0 (score: 1)
You can try
df['diff'] = df.date - df.date.min()
df
date diff
0 2011-01-01 00:00:10 0 days 00:00:00
1 2011-01-01 00:00:11 0 days 00:00:01
2 2011-02-01 00:00:11 31 days 00:00:01
3 2013-02-01 00:00:11 762 days 00:00:01
4 2014-02-01 00:00:11 1127 days 00:00:01
Answer 1 (score: 0)
You can use this approach without setting a new index.
Original DataFrame
df
date diff
0 2011-01-01 00:00:10 0.000000
1 2011-01-01 00:00:11 0.000011
2 2011-02-01 00:00:11 30.000011
3 2013-02-01 00:00:11 395.000011
4 2014-02-01 00:00:11 760.000011
Possible answer
df['diff_new'] = df['date'] - df.loc[0,'date']
date diff diff_new
0 2011-01-01 00:00:10 0.000000 0 days 00:00:00
1 2011-01-01 00:00:11 0.000011 0 days 00:00:01
2 2011-02-01 00:00:11 30.000011 31 days 00:00:01
3 2013-02-01 00:00:11 395.000011 762 days 00:00:01
4 2014-02-01 00:00:11 760.000011 1127 days 00:00:01
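By the way, the third row of your original data shows a different date difference from what I get (30 days rather than 31). You can check by hand with this online tool to calculate date differences in days.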
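One caveat with `df.loc[0, 'date']`: it looks up the row labelled 0, which is the earliest date only if the frame is sorted and still has its default index. A minimal sketch, assuming the rows may arrive unsorted (the shuffled sample frame below is made up for illustration), sorts first and then takes the first row by position with `iloc`:

```python
import pandas as pd

# Hypothetical sample: the dates from the question, deliberately shuffled.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2013-02-01 00:00:11",
        "2011-01-01 00:00:10",
        "2014-02-01 00:00:11",
        "2011-01-01 00:00:11",
        "2011-02-01 00:00:11",
    ])
})

# Sort by date. The original index labels are kept, so label 0 is no
# longer guaranteed to be the first row.
df = df.sort_values("date")

# iloc[0] selects by position, so after sorting this is the earliest date.
first = df["date"].iloc[0]
df["diff_new"] = df["date"] - first
```

In this sample `df.loc[0, 'date']` would return the 2013 date, so subtracting it would give negative differences for the earlier rows.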
Answer 2 (score: 0)
Here is something you can try.
>>> df
date
0 2011-01-01 00:00:10
1 2011-01-01 00:00:11
2 2011-02-01 00:00:11
3 2013-02-01 00:00:11
4 2014-02-01 00:00:11
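First convert the values to timestamps so that the data is constructed properly. After the conversion, simply difference the DataFrame. Note that this gives the difference between each row and the previous row, not the difference from the first date.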
>>> df2 = df.apply(lambda x: [pd.Timestamp(ts) for ts in x])
>>> df['diff'] = (df2 - df2.shift()).fillna(0)
>>> df
date diff
0 2011-01-01 00:00:10 0 days 00:00:00
1 2011-01-01 00:00:11 0 days 00:00:01
2 2011-02-01 00:00:11 31 days 00:00:00
3 2013-02-01 00:00:11 731 days 00:00:00
4 2014-02-01 00:00:11 365 days 00:00:00
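The output above is the gap between consecutive rows. If the elapsed time since the first row is wanted instead, the row-to-row steps can be accumulated. A minimal sketch, assuming the column is already sorted in ascending order and using `Series.diff` in place of the explicit `shift` subtraction:

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-01-01 00:00:10",
        "2011-01-01 00:00:11",
        "2011-02-01 00:00:11",
        "2013-02-01 00:00:11",
        "2014-02-01 00:00:11",
    ])
})

# Row-to-row difference. The first row has no predecessor, so it comes
# out as NaT and is filled with a zero Timedelta.
step = df["date"].diff().fillna(pd.Timedelta(0))

# Accumulating the steps gives the elapsed time since the first row.
df["diff"] = step.cumsum()
```

For sorted data the cumulative sum matches `df['date'] - df['date'].iloc[0]`, so the direct subtraction is usually the simpler choice.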
Answer 3 (score: 0)
This is how I would get the days as a floating-point value:
dates = pd.to_datetime(df.date) # make sure we are working with dates and not strings
df["diff"] = (dates - dates[0]).apply(lambda x: x.total_seconds() / 86400)
The resulting df:
date diff
0 2011-01-01 00:00:10 0.000000
1 2011-01-01 00:00:11 0.000012
2 2011-02-01 00:00:11 31.000012
3 2013-02-01 00:00:11 762.000012
4 2014-02-01 00:00:11 1127.000012
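The same float result can be obtained without a per-row `apply`. A minimal sketch, assuming the dates are sorted so that the first row is the reference date: dividing a timedelta Series by a one-day `pd.Timedelta` gives the number of days as floats, and the `.dt.total_seconds()` accessor is an equivalent route.

```python
import pandas as pd

df = pd.DataFrame({
    "date": [
        "2011-01-01 00:00:10",
        "2011-01-01 00:00:11",
        "2011-02-01 00:00:11",
        "2013-02-01 00:00:11",
        "2014-02-01 00:00:11",
    ]
})

# Make sure we are working with datetimes and not strings.
dates = pd.to_datetime(df["date"])

# Elapsed time since the first row, as a timedelta Series.
elapsed = dates - dates.iloc[0]

# Dividing by a one-day Timedelta yields the number of days as floats.
df["diff"] = elapsed / pd.Timedelta(days=1)

# Equivalent route via seconds (86400 seconds per day).
alt = elapsed.dt.total_seconds() / 86400
```

Both forms operate on the whole column at once, which tends to be faster than `apply` on large frames.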