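ValueError: Cannot cast DatetimeIndex to dtype datetime64[us]

Time: 2016-07-22 00:08:31

Tags: python postgresql pandas

I am trying to create a PostgreSQL table of 30-minute data for the S&P 500 ETF (spy30new, used for testing newly inserted data) from a table of 15-minute data for several stocks (all15). all15 has an index on 'dt' (timestamp) and 'instr' (stock symbol). I would like spy30new to have an index on 'dt'.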

import numpy as np
import pandas as pd
from datetime import datetime, date, time, timedelta
from dateutil import parser
from sqlalchemy import create_engine

# Query all15
engine = create_engine('postgresql://user:passwd@localhost:5432/stocks')
new15Df = (pd.read_sql_query("SELECT dt, o, h, l, c, v FROM all15 WHERE (instr = 'SPY') AND (date(dt) BETWEEN '2016-06-27' AND '2016-07-15');", engine)).sort_values('dt')
# Correct for Time Zone.
new15Df['dt'] = (new15Df['dt'].copy()).apply(lambda d: d + timedelta(hours=-4))

# spy0030Df contains the 15-minute data at 00 & 30 minute time points
# spy1545Df contains the 15-minute data at 15 & 45 minute time points
spy0030Df = (new15Df[new15Df['dt'].apply(lambda d: d.minute % 30) == 0]).reset_index(drop=True)
spy1545Df = (new15Df[new15Df['dt'].apply(lambda d: d.minute % 30) == 15]).reset_index(drop=True)

high = pd.concat([spy1545Df['h'], spy0030Df['h']], axis=1).max(axis=1)
low = pd.concat([spy1545Df['l'], spy0030Df['l']], axis=1).min(axis=1)
volume = spy1545Df['v'] + spy0030Df['v']

# spy30Df assembled and pushed to PostgreSQL as table spy30new
spy30Df = pd.concat([spy0030Df['dt'], spy1545Df['o'], high, low, spy0030Df['c'], volume], ignore_index = True, axis=1)
spy30Df.columns = ['dt', 'o', 'h', 'l', 'c', 'v']
spy30Df.set_index(['dt'], inplace=True)
spy30Df.to_sql('spy30new', engine, if_exists='append', index_label='dt')

This gives the error "ValueError: Cannot cast DatetimeIndex to dtype datetime64[us]". What I have tried:

  1. Not setting an index on 'dt':

    spy30Df.set_index(['dt'], inplace=True)  # Remove this line
    spy30Df.to_sql('spy30new', engine, if_exists='append')  # Delete the index_label option
  2. Converting 'dt' from type pandas.tslib.Timestamp to datetime.datetime using to_pydatetime() (in case psycopg2 can work with Python datetimes but not with pandas Timestamps):

    u = (spy0030Df['dt']).tolist()
    # list() is needed on Python 3, where map() returns an iterator
    timesAsPyDt = np.asarray(list(map((lambda d: d.to_pydatetime()), u)))
    spy30Df = pd.concat([spy1545Df['o'], high, low, spy0030Df['c'], volume], ignore_index = True, axis=1)
    newArray = np.c_[timesAsPyDt, spy30Df.values]
    colNames = ['dt', 'o', 'h', 'l', 'c', 'v']
    newDf = pd.DataFrame(newArray, columns=colNames)
    newDf.set_index(['dt'], inplace=True)
    newDf.to_sql('spy30new', engine, if_exists='append', index_label='dt')
  3. Using datetime.utcfromtimestamp():

    timesAsDt = (spy0030Df['dt']).apply(lambda d: datetime.utcfromtimestamp(d.tolist()/1e9))
  4. Using pd.to_datetime():

    timesAsDt = pd.to_datetime(spy0030Df['dt'])

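3 Answers:

Answer 0 (score: 7)

Using pd.to_datetime() on each element worked. Option 4, which does not work, applies pd.to_datetime() to the entire Series. Perhaps the Postgres driver understands Python datetime objects, but not the datetime64 type used by pandas and NumPy. Option 4 produced the correct output, but the ValueError from the title was raised when sending the DataFrame to Postgres.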
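The per-element conversion described in this answer can be sketched as follows. This is only an illustration: the `bars` frame and its closing prices are made up as a stand-in for spy0030Df, and whether the whole-Series form still triggers the error depends on the pandas and driver versions in use.

```python
import pandas as pd

# Hypothetical stand-in for spy0030Df: three 30-minute bars (made-up closes).
bars = pd.DataFrame({
    "dt": pd.date_range("2016-06-27 09:30", periods=3, freq="30min"),
    "c": [203.1, 203.4, 202.9],
})

# Per-element conversion: each Timestamp is round-tripped through its
# string form and parsed again, one value at a time.
per_element = bars["dt"].apply(lambda d: pd.to_datetime(str(d)))

# Option 4 from the question: convert the whole Series in a single call.
whole_series = pd.to_datetime(bars["dt"])

# Both hold the same instants; only the way they were produced differs.
same_values = list(per_element) == list(whole_series)
print(per_element.dtype, whole_series.dtype, same_values)
```

The per-element route calls the parser once per row, so it is likely to be much slower than the vectorized call on large tables.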


Answer 1 (score: 4)


                              Biomass  Fossil Brown coal/Lignite  Fossil Coal-derived gas  Fossil Gas  Fossil Hard coal  Fossil Oil  Geothermal  Hydro Pumped Storage  Hydro Run-of-river and poundage  Hydro Water Reservoir  Nuclear   Other  Other renewable    Solar  Waste  Wind Offshore  Wind Onshore
2018-02-02 00:00:00+01:00   4835.0                    16275.0                    446.0      1013.0            4071.0       155.0         5.0                   7.0                           1906.0                   35.0   8924.0  3643.0            142.0      0.0  595.0         2517.0       19999.0
2018-02-02 00:15:00+01:00   4834.0                    16272.0                    446.0      1010.0            3983.0       155.0         5.0                   7.0                           1908.0                   71.0   8996.0  3878.0            142.0      0.0  594.0         2364.0       19854.0
2018-02-02 00:30:00+01:00   4828.0                    16393.0                    446.0      1019.0            4015.0       155.0         5.0    


df.reset_index(level=0, inplace=True)  

Use this code to rename the column "index" to "DateTime".

df = df.rename(columns={'index': 'DateTime'})

Change the data type to "datetime64".

df['DateTime'] = df['DateTime'].astype('datetime64')


from sqlalchemy import create_engine

engine = create_engine('mysql+mysqlconnector://root:Password@localhost/generation_data', echo=True)
df.to_sql(con=engine, name='test', if_exists='replace')
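The steps in this answer can be put together as the rough sketch below. Several things here are assumptions rather than part of the answer: an in-memory SQLite database stands in for the MySQL server, only the Biomass column of the sample is reproduced, and `.dt.tz_localize(None)` is swapped in for `.astype('datetime64')`, since recent pandas versions may reject a unit-less 'datetime64' cast or a cast from timezone-aware to naive values.

```python
import sqlite3
from datetime import timedelta, timezone

import pandas as pd

# Timezone-aware DatetimeIndex with a fixed +01:00 offset, like the sample.
# Only the Biomass column is reproduced; the frame is otherwise illustrative.
idx = pd.date_range("2018-02-02", periods=3, freq="15min",
                    tz=timezone(timedelta(hours=1)))
df = pd.DataFrame({"Biomass": [4835.0, 4834.0, 4828.0]}, index=idx)

# Move the unnamed index into a column (pandas calls it "index"), then rename.
out = df.reset_index().rename(columns={"index": "DateTime"})

# Drop the timezone but keep the local wall-clock time.
out["DateTime"] = out["DateTime"].dt.tz_localize(None)

# In-memory SQLite stands in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
out.to_sql("test", conn, if_exists="replace", index=False)
rows = pd.read_sql_query("SELECT * FROM test", conn)
conn.close()
print(rows)
```

Note that `tz_localize(None)` keeps the local clock time. If the table should hold UTC instead, converting first with `.dt.tz_convert("UTC")` and then dropping the timezone is one option.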

Answer 2 (score: 3)


(df['Time']).apply(lambda d: pd.to_datetime(str(d)))





t = pd.to_datetime(df['Time'])
t = t.dt.tz_localize(None)  # .dt acts on the values; Series.tz_localize would act on the index
