熊猫:在数据框中为日期添加不同的天数

时间:2015-09-15 12:55:36

标签: python datetime pandas

我有一个带有日期列的数据框,然后是我要添加到该列的天数。我想用结果值创建一个新列“Recency_Date”。

DF:

   fan         Community Name  Count  Mean_Days   Date_Min
0   855              AAA Games      6        353 2013-04-16
1   855  First Person Shooters      2        420 2012-10-16
2   855            Playstation      3        108 2014-06-12
3  3148              AAA Games      1          0 2015-04-17
4  3148          Mobile Gaming      1          0 2013-01-19

df info:

merged.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4627415 entries, 0 to 4627414
Data columns (total 5 columns):
fan               int64
Community Name    object
Count             int64
Mean_Days         int32
Date_Min          datetime64[ns]
dtypes: datetime64[ns](1), int32(1), int64(2), object(1)
memory usage: 194.2+ MB

示例数据为csv:

fan,Community Name,Count,Mean_Days,Date_Min
855,AAA Games,6,353,2013-04-16 00:00:00
855,First Person Shooters,2,420,2012-10-16 00:00:00
855,Playstation,3,108,2014-06-12 00:00:00
3148,AAA Games,1,0,2015-04-17 00:00:00
3148,Mobile Gaming,1,0,2013-01-19 00:00:00
3148,Power PCs,2,0,2014-06-17 00:00:00
3148,XBOX,1,0,2009-11-12 00:00:00
3860,AAA Games,1,0,2012-11-28 00:00:00
3860,Minecraft,3,393,2011-09-07 00:00:00
4044,AAA Games,5,338,2010-11-15 00:00:00
4044,Blizzard Games,1,0,2013-07-12 00:00:00
4044,Geek Culture,1,0,2011-06-03 00:00:00
4044,Indie Games,2,112,2013-01-09 00:00:00
4044,Minecraft,1,0,2014-01-02 00:00:00
4044,Professional Gaming,1,0,2014-01-02 00:00:00
4044,XBOX,2,785,2010-11-15 00:00:00
4827,AAA Games,1,0,2010-08-24 00:00:00
4827,Gaming Humour,1,0,2012-05-05 00:00:00
4827,Minecraft,2,10,2012-03-21 00:00:00
5260,AAA Games,4,27,2013-09-17 00:00:00
5260,Indie Games,8,844,2011-06-08 00:00:00
5260,MOBA,2,0,2012-10-27 00:00:00
5260,Minecraft,5,106,2012-02-17 00:00:00
5260,XBOX,1,0,2011-06-15 00:00:00
5484,AAA Games,21,1296,2009-08-01 00:00:00
5484,Free to Play,1,0,2014-12-08 00:00:00
5484,Indie Games,1,0,2014-05-28 00:00:00
5484,Music Games,1,0,2012-09-12 00:00:00
5484,Playstation,1,0,2012-02-22 00:00:00

我试过了:

merged['Recency_Date'] = merged['Date_Min'] + timedelta(days=merged['Mean_Days'])

merged['Recency_Date'] = pd.DatetimeIndex(merged['Date_Min']) + pd.DateOffset(merged['Mean_Days'])

但是我很难找到适用于Series而不是单个int值的东西。任何和所有的帮助将非常感谢。

1 个答案:

答案 0 :(得分:2)

如果'Date_Min'dtype已经是日期时间,那么您可以从'Mean_Days'列构建Timedeltaindex并添加以下内容:

In [174]:
df = pd.DataFrame({'Date_Min':[dt.datetime.now(), dt.datetime(2015,3,4), dt.datetime(2011,6,9)], 'Mean_Days':[1,2,3]})
df

Out[174]:
                    Date_Min  Mean_Days
0 2015-09-15 14:02:37.452369          1
1 2015-03-04 00:00:00.000000          2
2 2011-06-09 00:00:00.000000          3

In [175]:
df['Date_Min'] + pd.TimedeltaIndex(df['Mean_Days'], unit='D')

Out[175]:
0   2015-09-16 14:02:37.452369
1   2015-03-06 00:00:00.000000
2   2011-06-12 00:00:00.000000
Name: Date_Min, dtype: datetime64[ns]