KeyError when plotting histograms with a for loop over a DataFrame

Asked: 2018-02-08 21:43:04

Tags: python pandas matplotlib histogram keyerror

My DataFrame looks like this:

df = pd.DataFrame({'Date': ['2016-01-05', '2016-01-05', '2016-01-05', '2016-01-05', '2016-01-08', '2016-01-08', '2016-02-01'], 'Count': [1, 2, 2, 3, 2, 0, 2]})

I'm trying to plot a histogram of Count for each unique Date.

I tried:

for date in df.Date.unique(): 
    plt.hist([df[df.Date == '%s' %(date)]['Count']])
    plt.title('%s' %(date))

which results in:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-17-971a1cf07250> in <module>()
      1 for date in df.Date.unique():
----> 2     plt.hist([df[df.Date == '%s' %(date)]['Count']])
      3     plt.title('%s' %(date))

c:~\anaconda3\lib\site-packages\matplotlib\pyplot.py in hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, hold, data, **kwargs)
   2963                       histtype=histtype, align=align, orientation=orientation,
   2964                       rwidth=rwidth, log=log, color=color, label=label,
-> 2965                       stacked=stacked, data=data, **kwargs)
   2966     finally:
   2967         ax.hold(washold)

c:~\anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
   1816                     warnings.warn(msg % (label_namer, func.__name__),
   1817                                   RuntimeWarning, stacklevel=2)
-> 1818             return func(ax, *args, **kwargs)
   1819         pre_doc = inner.__doc__
   1820         if pre_doc is None:

c:~\anaconda3\lib\site-packages\matplotlib\axes\_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   5925 
   5926         # basic input validation
-> 5927         flat = np.ravel(x)
   5928 
   5929         input_empty = len(flat) == 0

c:~\anaconda3\lib\site-packages\numpy\core\fromnumeric.py in ravel(a, order)
   1482         return asarray(a).ravel(order=order)
   1483     else:
-> 1484         return asanyarray(a).ravel(order=order)
   1485 
   1486 

c:~\anaconda3\lib\site-packages\numpy\core\numeric.py in asanyarray(a, dtype, order)
    581 
    582     """
--> 583     return array(a, dtype, copy=False, order=order, subok=True)
    584 
    585 

c:~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    581         key = com._apply_if_callable(key, self)
    582         try:
--> 583             result = self.index.get_value(self, key)
    584 
    585             if not lib.isscalar(result):

c:~\anaconda3\lib\site-packages\pandas\indexes\base.py in get_value(self, series, key)
   1978         try:
   1979             return self._engine.get_value(s, k,
-> 1980                                           tz=getattr(series.dtype, 'tz', None))
   1981         except KeyError as e1:
   1982             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas\index.pyx in pandas.index.IndexEngine.get_value (pandas\index.c:3332)()

pandas\index.pyx in pandas.index.IndexEngine.get_value (pandas\index.c:3035)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas\hashtable.c:6610)()

pandas\hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas\hashtable.c:6554)()

KeyError: 0

But when I simply print the same expression, there is no problem:

for date in df.Date.unique(): 
    print([df[df.Date == '%s' %(date)]['Count']])

[0    1
1    2
2    2
3    3
Name: Count, dtype: int64]
[4    2
5    0
Name: Count, dtype: int64]
[6    2
Name: Count, dtype: int64]

What is wrong with the way I'm calling plt.hist on my DataFrame here?

2 Answers:

Answer 0 (score: 1)

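You are passing a list that contains a pandas Series, and that is what causes the problem. You can unpack the groupby object instead and plot each group separately.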

gps = df.groupby('Date').Count
_, axes = plt.subplots(nrows=gps.ngroups)

for (_, g), ax in zip(gps, axes):  # reuse the groupby object built above
    g.plot.hist(ax=ax)

plt.show()


If you need more sugar in your plots, have a look at the pandas visualization docs.
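As one possible form of that "sugar", here is a minimal sketch that gives every subplot a title and a shared x-axis with common bin edges. The names groups, fig and bins, the figure size, and the Agg backend line are assumptions made for illustration and are not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Date': ['2016-01-05', '2016-01-05', '2016-01-05', '2016-01-05',
             '2016-01-08', '2016-01-08', '2016-02-01'],
    'Count': [1, 2, 2, 3, 2, 0, 2],
})

groups = df.groupby('Date')['Count']

# One row of axes per date; sharex keeps the histograms comparable.
fig, axes = plt.subplots(nrows=groups.ngroups, sharex=True, figsize=(6, 6))

# Integer bin edges covering every Count value in the frame.
bins = list(range(int(df['Count'].min()), int(df['Count'].max()) + 2))

for (date, counts), ax in zip(groups, axes):
    counts.plot.hist(ax=ax, bins=bins)
    ax.set_title(date)

fig.tight_layout()
```

In a notebook you would drop the matplotlib.use line and finish with plt.show().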

Answer 1 (score: 1)

Basically, you have two square brackets too many in your code.


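With plt.hist([series]), matplotlib tries to build a histogram from a one-element list whose single element is a Series rather than a number. That does not work: as the traceback shows, the array conversion inside np.ravel ends up in Series.__getitem__ with key 0, and a filtered Series whose index starts at 4 has no label 0, hence KeyError: 0.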

Instead, remove the brackets and pass the Series directly, which works fine:

plt.hist([series])  # <- wrong
plt.hist(series)    # <- correct
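To see why the index matters, here is a small sketch using the DataFrame from the question. It only illustrates the label lookup that fails; the variable names subset, index_labels, label_zero_found and plain_values are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2016-01-05', '2016-01-05', '2016-01-05', '2016-01-05',
             '2016-01-08', '2016-01-08', '2016-02-01'],
    'Count': [1, 2, 2, 3, 2, 0, 2],
})

# Filtering keeps the original row labels, so this Series is indexed 4 and 5.
subset = df[df.Date == '2016-01-08']['Count']
index_labels = list(subset.index)

# With an integer index, subset[0] is a label lookup, and there is no label 0.
try:
    subset[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

# The plain values carry no index, which is all a histogram needs.
plain_values = subset.values.tolist()

print(index_labels, label_zero_found, plain_values)
```

This also explains why printing the list works while plotting does not: print never tries to look up element 0 of the Series.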

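Applied to the loop from the question:

for date in df.Date.unique():
    # the dates are already strings, so no '%s' formatting is needed
    plt.hist(df[df.Date == date]['Count'])
    plt.title(date)

Note that this draws all histograms into the same plot. I'm not sure whether that is desired; if not, each date needs its own figure or axes.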
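If separate plots are what you want, one way to get them is to open a new figure per date. This is only a sketch under that assumption: the figures dictionary and the Agg backend line are illustrative additions, not something the answer prescribes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Date': ['2016-01-05', '2016-01-05', '2016-01-05', '2016-01-05',
             '2016-01-08', '2016-01-08', '2016-02-01'],
    'Count': [1, 2, 2, 3, 2, 0, 2],
})

figures = {}  # hypothetical container: one figure per date
for date, counts in df.groupby('Date')['Count']:
    fig, ax = plt.subplots()  # a fresh figure for every date
    ax.hist(counts)           # pass the Series itself, not a list around it
    ax.set_title(date)
    figures[date] = fig
```

pandas also offers DataFrame.hist with a by argument, for example df.hist(column='Count', by='Date'), which draws one subplot per group in a single call.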
