After updating pandas (0.23.4) and matplotlib (3.0.1), I get a strange error when trying to do something like the following:
import pandas as pd
import matplotlib.pyplot as plt

clrdict = {1: "#a6cee3", 2: "#1f78b4", 3: "#b2df8a", 4: "#33a02c"}

df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'y': [25, 20, 30, 25],
                        's': [100, 200, 300, 400],
                        'l': [1, 2, 3, 4]})
df_full['c'] = df_full['l'].replace(clrdict)

df_part = df_full[(df_full.x == 30)]

fig = plt.figure()
plt.scatter(x=df_full['x'],
            y=df_full['y'],
            s=df_full['s'],
            c=df_full['c'])
plt.show()

fig = plt.figure()
plt.scatter(x=df_part['x'],
            y=df_part['y'],
            s=df_part['s'],
            c=df_part['c'])
plt.show()
The scatter plot of the original DataFrame (df_full) is displayed without problems, but the plot of the partial DataFrame raises the following error:
Traceback (most recent call last):
  File "G:\data\project\test.py", line 27, in <module>
    c=df_part['c'])
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\pyplot.py", line 2864, in scatter
    is not None else {}), **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\__init__.py", line 1805, in inner
    return func(ax, *args, **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\axes\_axes.py", line 4195, in scatter
    isinstance(c[0], str))):
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\indexes\base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\_libs\index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
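This is caused by the color option c=df_part['c']. If you leave it out, the problem does not occur. This did not happen before the update, so you may not be able to reproduce it with older versions of matplotlib or pandas (I don't know which of the two causes it).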
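In my project, the line df_part = df_full[(df_full.x == i)] is used inside the update function of matplotlib.animation.FuncAnimation. The result is an animation over the values of x (which are timestamps in my project). So I need a way to slice the DataFrame.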
Answer 0 (score: 1)
This is a bug that was fixed by https://github.com/matplotlib/matplotlib/pull/12673.
The fix should hopefully be available in the next bugfix release, 3.0.2, which should be out within the next few days.
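Judging from the traceback in the question, the failure happens at isinstance(c[0], str) inside scatter. On a pandas Series with an integer index, c[0] is a lookup by label, not by position, and the filtered frame only keeps the labels 1 and 2. The following is a minimal sketch of just that lookup, without matplotlib; the names lookup_failed and first_color are made up for illustration.

```python
import pandas as pd

# Reduced version of the frame from the question (only the columns needed here).
df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'c': ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c"]})
df_part = df_full[df_full.x == 30]

# Filtering keeps the original row labels, so label 0 is gone.
print(list(df_part.index))

try:
    df_part['c'][0]      # label-based lookup: there is no label 0
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Positional access still works, because it ignores the labels.
first_color = df_part['c'].iloc[0]
print(lookup_failed, first_color)
```

The full frame has a label 0, which would explain why only the plot of the filtered frame fails.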
In the meantime, you can use the numpy array from the pandas Series, series.values.
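Applied to the code from the question, the workaround could look roughly like this. It is a sketch, not a tested fix for every version combination. Two things differ from the question and are assumptions made here: the Agg backend is selected so the snippet runs without a display, and the color column is built with map instead of replace.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

clrdict = {1: "#a6cee3", 2: "#1f78b4", 3: "#b2df8a", 4: "#33a02c"}

df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'y': [25, 20, 30, 25],
                        's': [100, 200, 300, 400],
                        'l': [1, 2, 3, 4]})
df_full['c'] = df_full['l'].map(clrdict)  # map used here in place of replace

df_part = df_full[df_full.x == 30]

fig, ax = plt.subplots()
# Passing .values hands matplotlib plain arrays, so c[0] is positional
# and no longer depends on the row labels left over from filtering.
points = ax.scatter(x=df_part['x'].values,
                    y=df_part['y'].values,
                    s=df_part['s'].values,
                    c=df_part['c'].values)
```

Inside a FuncAnimation update function the same idea applies: slice the DataFrame as before, then pass the .values of each column to scatter.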