How do I change the year value in a numpy datetime64?

Time: 2017-07-14 12:14:02

Tags: python pandas numpy

I have a pandas DataFrame with dtype=numpy.datetime64. In the data, I want to change

'2011-11-14T00:00:00.000000000'

to:

'2010-11-14T00:00:00.000000000'

or to some other year. The timedelta is not known, only the year to assign. This shows the year as an int:

Dates_profit.iloc[50][stock].astype('datetime64[Y]').astype(int)+1970

but I cannot assign a value this way. Does anyone know how to assign a year to a numpy.datetime64?

3 answers:

Answer 0: (score: 2)

Consider the following approach:

In [115]: df
Out[115]:
        Date
0 2000-01-01
1 2001-02-02
2 2002-03-03
3 2003-04-04
4 2004-05-05

In [116]: df.loc[:, 'Date'] = df['Date'].apply(lambda x: x.replace(year=1999))

In [117]: df
Out[117]:
        Date
0 1999-01-01
1 1999-02-02
2 1999-03-03
3 1999-04-04
4 1999-05-05
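
One thing the session above does not cover is a Feb 29 date combined with a non-leap target year. Here is a minimal sketch of one way to guard against that. The helper name `with_year` and the choice to clamp to Feb 28 are my own assumptions, not part of the answer:

```python
import calendar
import pandas as pd

def with_year(ts, year):
    """Hypothetical helper: return ts with its year replaced."""
    if ts.month == 2 and ts.day == 29 and not calendar.isleap(year):
        # Assumption: clamp Feb 29 to Feb 28 when the target year is not a leap year.
        return ts.replace(year=year, day=28)
    return ts.replace(year=year)

df = pd.DataFrame({'Date': pd.to_datetime(['2000-01-01', '2004-02-29', '2011-11-14'])})
df['Date'] = df['Date'].apply(lambda ts: with_year(ts, 1999))
print(df)
```

Rolling forward to Mar 1 instead of clamping would be just as defensible; which one is right depends on the data.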

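Answer 1: (score: 1)

numpy.datetime64 objects are hard to work with. To update a value, it is usually easier to convert the date to a standard Python datetime object, make the change, and then convert it back to a numpy.datetime64 value: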

import numpy as np
from datetime import datetime

dt64 = np.datetime64('2011-11-14T00:00:00.000000000')

# convert to timestamp:
ts = (dt64 - np.datetime64('1970-01-01T00:00:00')) / np.timedelta64(1, 's')

# standard utctime from timestamp
dt = datetime.utcfromtimestamp(ts)

# update year (replace returns a new object, so keep the result)
dt = dt.replace(year=2010)

# convert back to numpy.datetime64:
dt64 = np.datetime64(dt)

There may be a simpler way, but at least this works.
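
For what it's worth, one possibly shorter round trip is sketched below. It relies on `.item()` returning a `datetime.datetime` for microsecond precision, and the variable `new_dt64` is just a name I picked for illustration:

```python
import numpy as np

dt64 = np.datetime64('2011-11-14T00:00:00.000000000')

# datetime64[ns].item() returns a plain int, so cast to microseconds first;
# datetime64[us].item() returns a datetime.datetime.
dt = dt64.astype('datetime64[us]').item()

# replace() returns a new object, so keep the result
dt = dt.replace(year=2010)

# back to numpy; the result has microsecond precision
new_dt64 = np.datetime64(dt)
print(new_dt64)
```

This skips the manual epoch arithmetic, at the cost of dropping anything finer than microseconds.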

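Answer 2: (score: 0)

This vectorized solution gives the same result as iterating with x.replace(year=n) in pandas, but is at least 10x faster on large arrays.

It is important to remember that the year substituted into the datetime64 objects should be a leap year. With the Python datetime library, datetime(2012, 2, 29).replace(year=2011) raises a ValueError. Here, the function 'replace_year' simply moves 2012-02-29 to 2011-03-01.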
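
The datetime failure mentioned above is easy to reproduce. A small sketch (the `raised` flag is only there to make the outcome explicit):

```python
from datetime import datetime

leap_day = datetime(2012, 2, 29)
try:
    # 2011 has no Feb 29, so the standard library refuses the replacement
    leap_day.replace(year=2011)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```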

I am using numpy v1.13.1.

import numpy as np
import pandas as pd

def replace_year(x, year):
    """ Year must be a leap year for this to work """
    # Add number of days x is from JAN-01 to year-01-01 
    x_year = np.datetime64(str(year)+'-01-01') +  (x - x.astype('M8[Y]'))

    # Due to leap years calculate offset of 1 day for those days in non-leap year
    yr_mn = x.astype('M8[Y]') + np.timedelta64(59,'D')
    # np.int was removed in NumPy 1.24; the builtin int behaves the same here
    leap_day_offset = (yr_mn.astype('M8[M]') - yr_mn.astype('M8[Y]') - 1).astype(int)

    # However, due to days in non-leap years prior March-01, 
    # correct for previous step by removing an extra day
    non_leap_yr_beforeMarch1 = (x.astype('M8[D]') - x.astype('M8[Y]')).astype(int) < 59
    non_leap_yr_beforeMarch1 = np.logical_and(non_leap_yr_beforeMarch1, leap_day_offset).astype(int)
    day_offset = np.datetime64('1970') - (leap_day_offset - non_leap_yr_beforeMarch1).astype('M8[D]')

    # Finally, apply the day offset 
    x_year = x_year - day_offset
    return x_year


x = np.arange('2012-01-01', '2014-01-01', dtype='datetime64[h]')
x_datetime = pd.to_datetime(x)

x_year = replace_year(x, 1992)
x_datetime = x_datetime.map(lambda x: x.replace(year=1992))

print(x)
print(x_year)
print(x_datetime)
print(np.all(x_datetime.values == x_year))
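
If the target year is not guaranteed to be a leap year, a different vectorized route is to shift whole months rather than day offsets. The sketch below is not the function above: `replace_year_by_months` is a hypothetical name, and with this approach Feb 29 rolls over to Mar 1 when the target year has no leap day.

```python
import numpy as np

def replace_year_by_months(x, year):
    """Hypothetical helper: set the year of every element of a datetime64 array."""
    x = np.asarray(x)
    months = x.astype('M8[M]')        # truncate each value to the start of its month
    remainder = x - months            # time elapsed since the start of the month
    current_year = months.astype('M8[Y]').astype(np.int64) + 1970
    # Number of whole months to move, as a timedelta64[M] array
    shift = (12 * (year - current_year)).astype('m8[M]')
    return (months + shift) + remainder

x = np.arange('2012-01-01', '2014-01-01', dtype='datetime64[h]')
x_year = replace_year_by_months(x, 1992)
print(x_year[:3])

leap = np.array(['2012-02-29T05'], dtype='datetime64[h]')
print(replace_year_by_months(leap, 2011))
```

Month shifts are calendar-aware in numpy, so no manual leap-day correction is needed. I have not benchmarked this against the version above.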