Pandas: DataFrame to Panel error: NotImplementedError: Only 2-level MultiIndex are supported

Date: 2016-04-09 16:48:21

Tags: python pandas ipython panel

I'm having trouble converting a DataFrame with to_panel. I manipulate the data beforehand, and I'm worried those steps may be causing the problem.

merge.head()

      date     catcode  type    di  cid     feccandid   amount  disposition     bills
0   2005-12-31  G1100   24K     D   N00004045   H2MI11042   1500    support     1
1   2005-12-31  L1100   24K     D   N00004045   H2MI11042   8000    support     1
2   2005-12-31  L1100   24K     D   N00004155   H2MI02066   1000    oppose  1
3   2005-12-31  T1200   24K     D   N00004166   H4MI03045   3000    support     1

Then I build a pivot_table:

mm = merge.pivot_table(index=['date', 'feccandid', 'disposition',
                              'bills', 'cid', 'di', 'type'],
                       columns='catcode', values='amount',
                       fill_value=0)

                                                            catcode     A0000   A1000   A1100   A1200   A1300   A1400   A1500   A1600   A2000   A2300   ...     T9100   T9400   X3700   X4000   X4100   X4110   X5000   X7000   Y0000   Z5200
date       feccandid    disposition     bills       cid     di  type                                                                                    
2005-12-31  H2MI02066   oppose             1    N00004155   D   24K     0   0   0   0   0   0   0   0   0   0   ...     0   0   0   0   0   0   0   0   0   0
            H2MI11042   support            1    N00004045   D   24K     0   0   0   0   0   0   0   0   0   0   ...     0   0   0   0   0   0   0   0   0   0
            H4MI03045   support            1    N00004166   D   24K     0   0   0   0   0   0   0   0   0   0   ...     0   0   0   0   0   0   0   0   0   0

3 rows × 315 columns

Then I reset the index:

mm = mm.reset_index()
mm.head()

catcode     date    feccandid   disposition     bills   cid     di  type    A0000   A1000   A1100   ...     T9100   T9400   X3700   X4000   X4100   X4110   X5000   X7000   Y0000   Z5200
0        2005-12-31 H2MI02066     oppose        1   N00004155   D   24K     0   0   0   ...     0   0   0   0   0   0   0   0   0   0
1        2005-12-31 H2MI11042     support       1   N00004045   D   24K     0   0   0   ...     0   0   0   0   0   0   0   0   0   0
2        2005-12-31 H4MI03045     support       1   N00004166   D   24K     0   0   0   ...     0   0   0   0   0   0   0   0   0   0

Then I write it to CSV:

mm.to_csv('i.test', index=False)    

Read it back from CSV:

hh = pd.read_csv('i.test')

Set the index:

hh.set_index(['date', 'feccandid']).head(3)
                    disposition    bills    cid     di     type     A0000   A1000   A1100   A1200   A1300   ...     T9100   T9400   X3700   X4000   X4100   X4110   X5000   X7000   Y0000   Z5200
date        feccandid                                                                                   
2005-12-31  H2MI02066   oppose        1   N00004155     D   24K     0   0   0   0   0   ...     0   0   0   0   0   0   0   0   0   0
            H2MI11042   support       1   N00004045     D   24K     0   0   0   0   0   ...     0   0   0   0   0   0   0   0   0   0
            H4MI03045   support       1   N00004166     D   24K     0   0   0   0   0   ...     0   0   0   0   0   0   0   0   0   0

Convert to a Panel:

hh.to_panel()

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-86-9358192e71a3> in <module>()
----> 1 hh.to_panel()

/home/jayaramdas/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in to_panel(self)
   1210         if (not isinstance(self.index, MultiIndex) or  # pragma: no cover
   1211                 len(self.index.levels) != 2):
-> 1212             raise NotImplementedError('Only 2-level MultiIndex are supported.')
   1213 
   1214         if not self.index.is_unique:

NotImplementedError: Only 2-level MultiIndex are supported.

Any ideas, questions, or criticism?

1 Answer:

Answer 0 (score: 2)

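The set_index isn't happening in place: it returns a new DataFrame and the result is never assigned, so your hh doesn't have a MultiIndex as its index.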

>>> hh.to_panel()
Traceback (most recent call last):
  File "<ipython-input-4-9358192e71a3>", line 1, in <module>
    hh.to_panel()
  File "/home/dsm/sys/pys/3.5.1/lib/python3.5/site-packages/pandas/core/frame.py", line 1224, in to_panel
    raise NotImplementedError('Only 2-level MultiIndex are supported.')
NotImplementedError: Only 2-level MultiIndex are supported.

>>> hh.set_index(["date", "feccandid"]).to_panel()
<class 'pandas.core.panel.Panel'>
Dimensions: 20 (items) x 1 (major_axis) x 3 (minor_axis)
Items axis: catcode to Z5200
Major_axis axis: 2005-12-31 to 2005-12-31
Minor_axis axis: H2MI02066 to H4MI03045

You could add inplace=True to set_index, but simply reassigning with hh = hh.set_index(...) is fine.
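
To make the point about set_index concrete, here is a minimal sketch. The three-row frame is made-up toy data (only the column names are borrowed from the question), so treat it as an illustration rather than a reproduction of the original hh:

```python
import pandas as pd

# Hypothetical toy frame; not the asker's actual data.
hh = pd.DataFrame({
    "date": ["2005-12-31"] * 3,
    "feccandid": ["H2MI02066", "H2MI11042", "H4MI03045"],
    "A0000": [0, 0, 0],
})

# set_index returns a new DataFrame and leaves hh untouched.
indexed = hh.set_index(["date", "feccandid"])
unchanged_nlevels = hh.index.nlevels      # hh still has its default one-level index
indexed_nlevels = indexed.index.nlevels   # the returned frame has two index levels

# Reassigning keeps the result.
hh = hh.set_index(["date", "feccandid"])
is_multi = isinstance(hh.index, pd.MultiIndex)

print(unchanged_nlevels, indexed_nlevels, is_multi)
```

Checking hh.index.nlevels (or isinstance(hh.index, pd.MultiIndex)) right before the conversion is a quick way to confirm that the index really has the two levels the error message asks for.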

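As an aside: I think Panel is gradually being deprecated in favor of the richer N-d objects in xarray, so you may want to consider installing xarray and then doing: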

>>> hh.to_xarray()
<xarray.Dataset>
Dimensions:      (date: 1, feccandid: 3)
Coordinates:
  * date         (date) object '2005-12-31'
  * feccandid    (feccandid) object 'H2MI02066' 'H2MI11042' 'H4MI03045'
Data variables:
    catcode      (date, feccandid) int64 0 1 2
    disposition  (date, feccandid) object 'oppose' 'support' 'support'
    bills        (date, feccandid) int64 1 1 1
    cid          (date, feccandid) object 'N00004155' 'N00004045' 'N00004166'
    di           (date, feccandid) object 'D' 'D' 'D'
    type         (date, feccandid) object '24K' '24K' '24K'
    [...]
Then experiment that way instead.