"Already tz-aware" error when reading an h5 file with pandas in Python 3 (but not Python 2)

Asked: 2016-09-22 13:46:41

Tags: python python-3.x pandas timezone hdf5

I have an HDF5 store named weather.h5. My default Python environment is 3.5.2. When I try to read this store, I get TypeError: Already tz-aware, use tz_convert to convert

I have tried both pd.read_hdf('weather.h5', 'weather_history') and pd.io.pytables.HDFStore('weather.h5')['weather_history'], but I get the error either way.

I can open the h5 file in a Python 2.7 environment. Is this a bug in Python 3 / pandas?

1 answer:

Answer 0: (score: 0)

I have the same problem. I am using Anaconda Python 3.4.5 and 2.7.3. Both use pandas 0.18.1.

Here is a reproducible example:

generate.py (run with Python 2):

import pandas as pd
from pandas import HDFStore

index = pd.DatetimeIndex(['2017-06-20 06:00:06.984630-05:00', '2017-06-20 06:03:01.042616-05:00'], dtype='datetime64[ns, CST6CDT]', freq=None)
p1 = [0, 1]
p2 = [0, 2]

# Saving either of these DataFrames triggers the issue
df1 = pd.DataFrame({"p1":p1, "p2":p2}, index=index)
df2 = pd.DataFrame({"p1":p1, "p2":p2, "i":index})

store = HDFStore("./test_issue.h5")
store['df'] = df1
#store['df'] = df2
store.close()

read_issue.py:

import pandas as pd
from pandas import HDFStore

store = HDFStore("./test_issue.h5", mode="r")
df = store['/df']
store.close()

print(df)

Running read_issue.py in Python 2 works without problems and produces this output:

                                  p1  p2
2017-06-20 11:00:06.984630-05:00   0   0
2017-06-20 11:03:01.042616-05:00   1   2

But running it in Python 3 fails with this traceback:

  

Traceback (most recent call last):
  File "read_issue.py", line 5, in <module>
    df = store['df']
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 417, in __getitem__
    return self.get(key)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 634, in get
    return self._read_group(group)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 1272, in _read_group
    return s.read(**kwargs)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2779, in read
    ax = self.read_index('axis%d' % i)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2367, in read_index
    _, index = self.read_index_node(getattr(self.group, key))
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2492, in read_index_node
    _unconvert_index(data, kind, encoding=self.encoding), **kwargs)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/indexes/base.py", line 153, in __new__
    result = DatetimeIndex(data, copy=copy, name=name, **kwargs)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/util/decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/tseries/index.py", line 321, in __new__
    raise TypeError("Already tz-aware, use tz_convert "
TypeError: Already tz-aware, use tz_convert to convert.
Closing remaining open files:./test_issue.h5...done
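For context, this is the same exception pandas raises in ordinary code when a time zone is attached to an index that already has one. The sketch below only illustrates what the message means; it does not reproduce the HDF5 read failure, where the error is raised inside pandas itself. The timestamp and the America/Chicago zone are made-up illustration values.

```python
import pandas as pd

# A tz-aware index (illustrative value, not taken from weather.h5).
aware = pd.DatetimeIndex(["2017-06-20 06:00:06"]).tz_localize("America/Chicago")

# Localizing an index that already carries a time zone is rejected.
try:
    aware.tz_localize("UTC")
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)

# tz_convert is the supported way to move an aware index to another zone.
converted = aware.tz_convert("UTC")
print(converted)
```

The conversion keeps the same instants in time and only changes the zone they are displayed in.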

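So there is a problem with the index. However, if df2 is saved in generate.py instead (the datetimes as a column rather than as the index), running read_issue.py in Python 3 produces a different error: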

  

Traceback (most recent call last):
  File "read_issue.py", line 5, in <module>
    df = store['/df']
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 417, in __getitem__
    return self.get(key)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 634, in get
    return self._read_group(group)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 1272, in _read_group
    return s.read(**kwargs)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/io/pytables.py", line 2788, in read
    placement=items.get_indexer(blk_items))
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/core/internals.py", line 2518, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "/home/denper/anaconda3/envs/py34/lib/python3.4/site-packages/pandas/core/internals.py", line 90, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 2, placement implies 1
Closing remaining open files:./test_issue.h5...done

Also, if generate.py is run in Python 3 (saving either df1 or df2), then read_issue.py runs without problems in both Python 3 and Python 2.
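One possible workaround, which I have not verified against the versions above, is to keep the time zone out of the stored index: convert to naive UTC before writing and re-attach the zone after reading. This assumes tz-naive indexes are not affected by the problem. The helper names to_naive_utc and restore_tz are hypothetical, and America/Chicago stands in for the zone used in generate.py.

```python
import pandas as pd


def to_naive_utc(df):
    """Return a copy with a naive-UTC index, plus the original zone name."""
    tz_name = str(df.index.tz)
    out = df.copy()
    out.index = df.index.tz_convert("UTC").tz_localize(None)
    return out, tz_name


def restore_tz(df, tz_name):
    """Interpret a naive index as UTC and convert it back to tz_name."""
    out = df.copy()
    out.index = df.index.tz_localize("UTC").tz_convert(tz_name)
    return out


index = pd.DatetimeIndex(
    ["2017-06-20 06:00:06.984630", "2017-06-20 06:03:01.042616"]
).tz_localize("America/Chicago")
df = pd.DataFrame({"p1": [0, 1], "p2": [0, 2]}, index=index)

naive_df, tz_name = to_naive_utc(df)
# naive_df is what would be written to and read back from the HDFStore;
# tz_name has to be kept somewhere alongside it.
restored = restore_tz(naive_df, tz_name)
print(restored)
```

The zone name has to travel with the data separately, so this trades the tz-aware index for an extra piece of bookkeeping.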