I have a pandas DataFrame with dtype=numpy.datetime64.
In the data I want to change
'2011-11-14T00:00:00.000000000'
to:
'2010-11-14T00:00:00.000000000'
or to some other year. The timedelta is not known, only the year to assign. This gives the year as an int:
Dates_profit.iloc[50][stock].astype('datetime64[Y]').astype(int)+1970
but I cannot assign a value this way.
Does anyone know how to assign a year to a numpy.datetime64?
Answer 0 (score: 2)
Consider the following approach:
In [115]: df
Out[115]:
Date
0 2000-01-01
1 2001-02-02
2 2002-03-03
3 2003-04-04
4 2004-05-05
In [116]: df.loc[:, 'Date'] = df['Date'].apply(lambda x: x.replace(year=1999))
In [117]: df
Out[117]:
Date
0 1999-01-01
1 1999-02-02
2 1999-03-03
3 1999-04-04
4 1999-05-05
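One thing to watch with the apply approach is February 29: Timestamp.replace raises ValueError when the target year is not a leap year. A minimal sketch of a guard follows. The helper name with_year and the fallback to February 28 are assumptions made for illustration, not part of the answer above.

```python
import pandas as pd

# Sample frame; the last row is a leap day so the fallback path is exercised.
df = pd.DataFrame({"Date": pd.to_datetime(["2000-01-01", "2001-02-02", "2004-02-29"])})

def with_year(ts, year):
    """Return ts with its year replaced (hypothetical helper)."""
    try:
        return ts.replace(year=year)
    except ValueError:
        # Assumption: clamp Feb 29 to Feb 28 when the target year is not a leap year.
        return ts.replace(year=year, day=28)

df["Date"] = df["Date"].apply(lambda ts: with_year(ts, 1999))
print(df)
```

Whether to clamp to February 28 or roll over to March 1 depends on the data, so the fallback line is the part to adjust.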
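Answer 1 (score: 1)
numpy.datetime64 objects are awkward to work with. To update a value, it is usually easier to convert the date to a standard Python datetime object, make the change, and then convert it back to a numpy.datetime64 value: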
import numpy as np
from datetime import datetime
dt64 = np.datetime64('2011-11-14T00:00:00.000000000')
# convert to timestamp:
ts = (dt64 - np.datetime64('1970-01-01T00:00:00')) / np.timedelta64(1, 's')
# standard utctime from timestamp
dt = datetime.utcfromtimestamp(ts)
# update year
dt = dt.replace(year=2010)  # replace() returns a new object, so keep the result
# convert back to numpy.datetime64:
dt64 = np.datetime64(dt)
There may be simpler ways, but at least this works.
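If pandas is already in use, as it is in the question, a shorter round trip goes through pandas.Timestamp, which accepts a numpy.datetime64 and has its own replace method. A small sketch, with illustrative variable names:

```python
import numpy as np
import pandas as pd

dt64 = np.datetime64('2011-11-14T00:00:00.000000000')

# Timestamp.replace returns a new Timestamp; to_datetime64 converts it back to numpy.
new_dt64 = pd.Timestamp(dt64).replace(year=2010).to_datetime64()
print(new_dt64)
```

As with datetime.replace, this raises ValueError for February 29 when the target year is not a leap year.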
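Answer 2 (score: 0)
This vectorized solution gives the same result as iterating with x.replace(year=n) in pandas, but on large arrays it is at least 10x faster.
It is important to remember that the year substituted into the datetime64 objects should be a leap year. With the Python datetime library, datetime(2012, 2, 29).replace(year=2011) raises a ValueError. Here, the function 'replace_year' simply moves 2012-02-29 to 2011-03-01.
I am using numpy v1.13.1.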
import numpy as np
import pandas as pd
def replace_year(x, year):
    """ Year must be a leap year for this to work """
    # Add number of days x is from JAN-01 to year-01-01
    x_year = np.datetime64(str(year)+'-01-01') + (x - x.astype('M8[Y]'))
    # Due to leap years calculate offset of 1 day for those days in non-leap year
    yr_mn = x.astype('M8[Y]') + np.timedelta64(59,'D')
    leap_day_offset = (yr_mn.astype('M8[M]') - yr_mn.astype('M8[Y]') - 1).astype(int)
    # However, due to days in non-leap years prior March-01,
    # correct for previous step by removing an extra day
    non_leap_yr_beforeMarch1 = (x.astype('M8[D]') - x.astype('M8[Y]')).astype(int) < 59
    non_leap_yr_beforeMarch1 = np.logical_and(non_leap_yr_beforeMarch1, leap_day_offset).astype(int)
    day_offset = np.datetime64('1970') - (leap_day_offset - non_leap_yr_beforeMarch1).astype('M8[D]')
    # Finally, apply the day offset
    x_year = x_year - day_offset
    return x_year
x = np.arange('2012-01-01', '2014-01-01', dtype='datetime64[h]')
x_datetime = pd.to_datetime(x)
x_year = replace_year(x, 1992)
x_datetime = x_datetime.map(lambda x: x.replace(year=1992))
print(x)
print(x_year)
print(x_datetime)
print(np.all(x_datetime.values == x_year))
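A shorter vectorized variant is sketched below. It is an alternative technique, not the method used in this answer: it keeps the month and the offset within the month, and swaps only the year, using month-level datetime64 arithmetic. The function name set_year is hypothetical. Under this approach February 29 spills over to March 1 when the target year is not a leap year.

```python
import numpy as np

def set_year(x, year):
    """Hypothetical helper: replace the year of each element of a datetime64 array."""
    months = x.astype('M8[M]')                # truncate to month precision
    month_idx = months.astype(np.int64) % 12  # 0 = January ... 11 = December
    within_month = x - months                 # offset from the start of the month
    # Months since 1970-01 for the same calendar month in the target year.
    new_months = ((year - 1970) * 12 + month_idx).astype('M8[M]')
    return new_months + within_month

x = np.array(['2012-02-29T05', '2013-11-14T00'], dtype='M8[h]')
result = set_year(x, 2011)
print(result)
```

Because the day offset is counted from the start of each month rather than from January 1, no separate leap-day correction is needed for dates outside February 29.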