Making a DataFrame from a dictionary

Time: 2018-10-23 05:40:57

Tags: python python-3.x pandas dataframe

I have data like this:

>>> cc
defaultdict(<class 'dict'>, {1540272960: {'max': 1.14614, 'to': 1540273020, 'close': 1.14606, 'from': 1540272960, 'open': 1.145935, 'volume': 96, 'id': 366597, 'min': 1.14593, 'at': 1540273020040554921}, 1540273020: {'active_id': 1, 'to': 1540273080, 'ask': 1.14622, 'open': 1.14606, 'max_at': 1540273034, 'size': 60, 'max': 1.146135, 'at': 1540273040013821491, 'min_at': 1540273020, 'close': 1.146095, 'from': 1540273020, 'volume': 42, 'bid': 1.14597, 'id': 366598, 'min': 1.14606}})

I tried to convert it into a row-and-column format using pandas:

>>> df = pd.DataFrame(cc)
>>> df
             1540273080    1540273140
active_id  1.000000e+00  1.000000e+00
ask        1.146160e+00  1.146160e+00
at         1.540273e+18  1.540273e+18
bid        1.145910e+00  1.145910e+00
close      1.146035e+00  1.146035e+00
from       1.540273e+09  1.540273e+09
id         3.665990e+05  3.666000e+05
max        1.146100e+00  1.146055e+00
max_at     1.540273e+09  1.540273e+09
min        1.146030e+00  1.146035e+00
min_at     1.540273e+09  1.540273e+09
open       1.146080e+00  1.146040e+00
size       6.000000e+01  6.000000e+01
to         1.540273e+09  1.540273e+09
volume     9.500000e+01  9.000000e+00

This is what I get:

>>> df.index
Index(['active_id', 'ask', 'at', 'bid', 'close', 'from', 'id', 'max', 'max_at',
       'min', 'min_at', 'open', 'size', 'to', 'volume'],
      dtype='object')

>>> df["volume"]
Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python35\lib\site-packages\pandas\core\indexes\base.py", line 3078, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: 'volume'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python35\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "C:\Python35\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Python35\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "C:\Python35\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "C:\Python35\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: 'volume'

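But the result comes out as a vertical DataFrame: the field names end up in the index rather than in the columns. I want the keys to be the index, with the values placed in their respective columns. How can I do this?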

2 answers:

Answer 0: (score: 2)

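Use DataFrame.from_dict with orient='index', so that the outer keys become the index and the inner keys become the columns: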

import pandas as pd

df = pd.DataFrame.from_dict(cc, orient='index')
print(df)
                 max          to     close        from      open  volume  \
1540272960  1.146140  1540273020  1.146060  1540272960  1.145935      96   
1540273020  1.146135  1540273080  1.146095  1540273020  1.146060      42   

                id      min                   at  active_id      ask  \
1540272960  366597  1.14593  1540273020040554921        NaN      NaN   
1540273020  366598  1.14606  1540273040013821491        1.0  1.14622   

                  max_at  size        min_at      bid  
1540272960           NaN   NaN           NaN      NaN  
1540273020  1.540273e+09  60.0  1.540273e+09  1.14597  
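
As a minimal, self-contained sketch of the same call: the dictionary below is a shortened, hypothetical stand-in for `cc` (only a few of the question's fields are kept), not the full data. It shows that once the timestamps are the index, a lookup such as `df["volume"]` should work.

```python
from collections import defaultdict

import pandas as pd

# Shortened stand-in for the question's `cc` (assumption: only a few fields kept).
cc = defaultdict(dict)
cc[1540272960] = {"open": 1.145935, "close": 1.14606, "volume": 96}
cc[1540273020] = {"open": 1.14606, "close": 1.146095, "volume": 42, "bid": 1.14597}

# Outer keys (timestamps) become the index; inner keys become the columns.
df = pd.DataFrame.from_dict(cc, orient="index")

# Column lookup by field name now succeeds.
print(df["volume"])

# A single cell, addressed by timestamp (row) and field (column).
print(df.loc[1540273020, "bid"])
```

Fields that are missing from a record (here `bid` for the first timestamp) are filled with NaN, as in the output above.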

Another idea, from @Anton vBR, is to transpose with T:

df = pd.DataFrame(cc).T
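
For context on why `df["volume"]` raised a KeyError in the question: `pd.DataFrame(cc)` treats the outer keys as column labels, so `'volume'` is a row label in that frame. The sketch below illustrates this with a shortened stand-in for `cc` (an assumption, not the full data); `wide` is just an illustrative name.

```python
import pandas as pd

# Shortened stand-in for the question's `cc` (assumption: only a few fields kept).
cc = {
    1540272960: {"open": 1.145935, "close": 1.14606, "volume": 96},
    1540273020: {"open": 1.14606, "close": 1.146095, "volume": 42},
}

# Without transposing: columns are the timestamps, rows are the field names.
wide = pd.DataFrame(cc)
print("volume" in wide.columns)  # False, which is why df["volume"] failed

# On the untransposed frame, the field is reachable as a row label.
print(wide.loc["volume"])

# After transposing, the field names are the columns.
df = wide.T
print(df["volume"])
```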

Answer 1: (score: 1)

Or, similar to @jezrael's second approach, but using transpose:

df = pd.DataFrame(cc).transpose()

Then:

print(df)

gives:

                 max          to     close        from      open  volume  \
1540272960  1.146140  1540273020  1.146060  1540272960  1.145935      96   
1540273020  1.146135  1540273080  1.146095  1540273020  1.146060      42   

                id      min                   at  active_id      ask  \
1540272960  366597  1.14593  1540273020040554921        NaN      NaN   
1540273020  366598  1.14606  1540273040013821491        1.0  1.14622   

                  max_at  size        min_at      bid  
1540272960           NaN   NaN           NaN      NaN  
1540273020  1.540273e+09  60.0  1.540273e+09  1.14597  

as expected.
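
One thing that may be worth checking is whether the two approaches infer the same column dtypes. My understanding is that the transposed frame is built from columns that mix every field, so integer fields such as `volume` may come back as floats after transposing. The sketch below, again with a shortened stand-in for `cc` and the illustrative names `by_rows` and `transposed`, only prints the dtypes so you can verify the behaviour on your own pandas version.

```python
import pandas as pd

# Shortened stand-in for the question's `cc` (assumption: only two fields kept).
cc = {
    1540272960: {"close": 1.14606, "volume": 96},
    1540273020: {"close": 1.146095, "volume": 42},
}

by_rows = pd.DataFrame.from_dict(cc, orient="index")
transposed = pd.DataFrame(cc).transpose()

# The labels and numeric values agree between the two frames.
print(by_rows["volume"].tolist())
print(transposed["volume"].tolist())

# The inferred dtypes are worth comparing before relying on either frame.
print(by_rows.dtypes)
print(transposed.dtypes)
```

If exact integers matter (for example the nanosecond `at` values in the question), `from_dict` with `orient='index'` is probably the safer choice, but inspect `df.dtypes` rather than taking that for granted.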