Scatter plot of a pandas DataFrame ends with KeyError: 0

Time: 2018-11-06 17:31:01

Tags: python pandas matplotlib

After updating pandas (0.23.4) and matplotlib (3.0.1), I get a strange error when trying to do something like the following:

import pandas as pd
import matplotlib.pyplot as plt


clrdict = {1: "#a6cee3", 2: "#1f78b4", 3: "#b2df8a", 4: "#33a02c"}

df_full = pd.DataFrame({'x':[20,30,30,40],
                        'y':[25,20,30,25],
                        's':[100,200,300,400],
                        'l':[1,2,3,4]})

df_full['c'] = df_full['l'].replace(clrdict)

df_part = df_full[(df_full.x == 30)]

fig = plt.figure()
plt.scatter(x=df_full['x'],
            y=df_full['y'],
            s=df_full['s'],
            c=df_full['c'])
plt.show()

fig = plt.figure()
plt.scatter(x=df_part['x'],
            y=df_part['y'],
            s=df_part['s'],
            c=df_part['c'])
plt.show()

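The scatter plot of the original DataFrame (df_full) is displayed without problems. But the plot of the partial DataFrame (df_part) raises the following error: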

Traceback (most recent call last):
  File "G:\data\project\test.py", line 27, in <module>
    c=df_part['c'])
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\pyplot.py", line 2864, in scatter
    is not None else {}), **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\__init__.py", line 1805, in inner
    return func(ax, *args, **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\axes\_axes.py", line 4195, in scatter
    isinstance(c[0], str))):
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\indexes\base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\_libs\index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

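This is caused by the color option c=df_part['c']. If I leave it out, the problem does not occur. This did not happen before the update, so you may not be able to reproduce it with older versions of matplotlib or pandas (I don't know which of the two is responsible).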

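In my project, the line df_part = df_full[(df_full.x == i)] is used inside the update function of matplotlib.animation.FuncAnimation. The result is an animation over the values of x (which are timestamps in my project). So I need a way to slice the DataFrame.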

1 Answer:

Answer 0: (score: 1)

This is a bug that was fixed by https://github.com/matplotlib/matplotlib/pull/12673.
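Judging from the traceback in the question, the failure happens where matplotlib evaluates c[0] on the Series passed as c. A filtered Series keeps its original index labels, so after the x == 30 filter there is no label 0 left. The following is a minimal sketch that reproduces just that lookup with pandas alone (no plotting); it assumes the same data as the question and is not the matplotlib code itself:

```python
import pandas as pd

# Same colors and x values as in the question.
df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'c': ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c"]})

# Boolean filtering keeps the original index labels of the surviving rows.
df_part = df_full[df_full.x == 30]
c = df_part['c']
remaining_labels = c.index.tolist()
print(remaining_labels)  # [1, 2]

# With an integer index, c[0] is a label lookup, and label 0 was filtered out.
try:
    c[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)  # True

# The underlying array is indexed by position, so element 0 exists.
first_color = c.values[0]
print(first_color)  # #1f78b4
```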

The fix should hopefully be available in the next bugfix release, 3.0.2, which should be out within the next few days.

In the meantime, you can use the numpy array underlying the pandas Series, series.values.
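Applied to the example from the question, that workaround might look like the sketch below. Two things here are assumptions rather than part of the original code: the Agg backend is selected only so the snippet runs without a display, and map() is used in place of replace() to build the color column (both give the same result for this data).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, not needed interactively
import matplotlib.pyplot as plt
import pandas as pd

clrdict = {1: "#a6cee3", 2: "#1f78b4", 3: "#b2df8a", 4: "#33a02c"}

df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'y': [25, 20, 30, 25],
                        's': [100, 200, 300, 400],
                        'l': [1, 2, 3, 4]})
df_full['c'] = df_full['l'].map(clrdict)

df_part = df_full[df_full.x == 30]

fig, ax = plt.subplots()
# Pass plain arrays instead of Series, so matplotlib indexes by position
# and never looks up the missing index label 0.
points = ax.scatter(x=df_part['x'].values,
                    y=df_part['y'].values,
                    s=df_part['s'].values,
                    c=df_part['c'].values)
```

In an interactive session, plt.show() would display the figure as in the question. The same .values conversion can be applied to the slice taken inside a FuncAnimation update function.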