How do you read a PyTables table into a pandas DataFrame?

Time: 2015-01-13 09:39:41

Tags: python pandas pytables

I have built a PyTables table and filled it using append:

import tables as tb

h5file = tb.open_file("FGBS.h5", mode="a")
group = h5file.create_group("/", "hybrid")
# Hybrid is the row description class shown below
table = h5file.create_table(group, "z4", Hybrid, filters=tb.Filters(5, "blosc"))

using this row description:

class Hybrid(tb.IsDescription):
    dateTime = tb.Time32Col()  # seconds since the epoch, stored as a 32-bit integer
    price = tb.Float64Col()
    quantity = tb.Float64Col()
    bidPrc = tb.Float64Col()
    bidSize = tb.Float64Col()
    askPrc = tb.Float64Col()
    askSize = tb.Float64Col()

and I append to the table like this:

if 0 in dictInstrumentsData[message.symbol].bidPrice and 0 in dictInstrumentsData[message.symbol].askPrice:
    hybrid = table.row
    hybrid["dateTime"] = message.timestamp * 0.001
    hybrid["price"] = message.price
    hybrid["quantity"] = message.size
    hybrid["bidPrc"] = dictInstrumentsData[message.symbol].bidPrice[0]
    hybrid["bidSize"] = dictInstrumentsData[message.symbol].bidSize[0]
    hybrid["askPrc"] = dictInstrumentsData[message.symbol].askPrice[0]
    hybrid["askSize"] = dictInstrumentsData[message.symbol].askSize[0]

    hybrid.append()

Now I want to read it back into a pandas DataFrame like this:

a = tb.open_file("FGBS.h5")
table = a.root.quote.z4
c = pd.DataFrame.from_records(table)

But when I look at c, this is what I get:

                                          0       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          1       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          2       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          3       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          4       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          5       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          6       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          7       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          8       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          9       \
0  /hybrid/z4.row (Row), pointing to row #164411   

                       ...                        \
0                      ...                         

                                          164401  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164402  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164403  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164404  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164405  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164406  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164407  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164408  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164409  \
0  /hybrid/z4.row (Row), pointing to row #164411   

                                          164410  
0  /hybrid/z4.row (Row), pointing to row #164411  

[1 rows x 164411 columns]

rather than a DataFrame whose columns are the Hybrid columns, with one row per append. Can anyone tell me what I am doing wrong?

1 answer:

Answer 0 (score: 3)

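You need to read the data out of the table explicitly. Passing the Table object itself makes pandas iterate over it, and that iteration yields the same Row object again and again rather than the stored values, which is why every cell in your output shows a Row. Table.read pulls in the whole table, and Table.read_where lets you apply a condition to filter the rows that are returned.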

a = tb.open_file("FGBS.h5")
table = a.root.quote.z4
c = pd.DataFrame.from_records(table.read())
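
To make the difference between Table.read and Table.read_where concrete, here is a minimal sketch rather than a drop-in fix. It assumes a scratch file called example.h5 in a temporary directory (a hypothetical name, so FGBS.h5 is not touched), a trimmed three-column version of Hybrid, and made-up sample values. The node path /hybrid/z4 follows the create_group call in the question; adjust it to wherever your table actually lives.

```python
import os
import tempfile

import pandas as pd
import tables as tb


class Hybrid(tb.IsDescription):
    # Trimmed version of the description in the question (assumption: three columns only).
    dateTime = tb.Time32Col()
    price = tb.Float64Col()
    quantity = tb.Float64Col()


# Hypothetical scratch file so the sketch does not touch FGBS.h5.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write five made-up rows, one append per row as in the question.
with tb.open_file(path, mode="w") as h5file:
    group = h5file.create_group("/", "hybrid")
    table = h5file.create_table(group, "z4", Hybrid)
    row = table.row
    for i in range(5):
        row["dateTime"] = 1421141981 + i  # epoch seconds
        row["price"] = 100.0 + i
        row["quantity"] = 10.0 * i
        row.append()
    table.flush()  # push buffered rows to disk

with tb.open_file(path, mode="r") as h5file:
    table = h5file.root.hybrid.z4
    # read() returns a NumPy structured array holding every row.
    everything = pd.DataFrame.from_records(table.read())
    # read_where() returns only the rows matching the condition string.
    filtered = pd.DataFrame.from_records(table.read_where("price > 102"))

print(everything)
print(filtered)
```

Both calls return a NumPy structured array, so from_records can map each field to a DataFrame column. read_where is worth reaching for when the table is large and only a slice of it is needed, since the filtering happens before the data reaches pandas.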