Why do I get a KeyError when plotting a pandas DataFrame with matplotlib?

Time: 2013-07-25 07:42:52

Tags: python matplotlib pandas ipython

I have this DataFrame:

       date_obj      col1     col2       col3      col4
40038  2012-11-19   1.000   0.831856   0.986209   0.843919
40039  2012-11-20   2.015   0.521764   1.177320   0.938245
40040  2012-11-21   1.160   1.645345   1.964620   4.536440
40041  2012-11-22   3.171   2.444018   2.931550   3.737840
40042  2012-11-23   4.563   3.208111   3.587250   2.434040
40043  2012-11-24   5.379   3.863732   3.824540   1.634780
40044  2012-11-26   1.125  20.756739   4.162820  23.552100
40045  2012-11-27   3.340   5.369354   4.535090   1.129290
40046  2012-11-28   5.463  12.185730   8.102790   1.224300
40047  2012-11-29   6.596  14.328685   9.271000  24.655600
40048  2012-11-30  31.544  13.513497  12.103400  21.273500
40049  2012-12-01  24.921  26.144050  16.256200  13.883100
40050  2012-12-03   5.488   2.581351   7.220790   3.349450
40051  2012-12-04   6.977   5.893819   5.548870   2.948770
40052  2012-12-05   7.115   6.533022   5.863820   2.517030
40053  2012-12-06   5.842   8.754232   7.518660   1.447940
40054  2012-12-07   6.346  12.018631  10.263100  11.837400
40055  2012-12-08  17.666   4.548846  10.610400  11.110800
40056  2012-12-10   4.300   2.823566   1.475000   1.989210
40057  2012-12-11   2.415   2.436319   2.677440   2.908270
40058  2012-12-12   2.319   2.121092   3.455550   3.890480
40059  2012-12-13   1.000   1.633918   3.858540   4.316940
40060  2012-12-14   2.238   1.688475   5.065990   5.267850
40061  2012-12-15   1.798   2.621267   7.175370   6.957340

I try to plot it like this:

plt.figure(figsize=(17, 10))
plt.setp(plt.xticks()[1], rotation=45)
plt.plot_date(df_cut['date_obj'],df_cut['col1'], color='black', linestyle='-', markersize=3, linewidth=2)
plt.plot_date(df_cut['date_obj'],df_cut['col2'], color='red', linestyle='-', markersize=3)
plt.plot_date(df_cut['date_obj'],df_cut['col3'], color='green', linestyle='-', markersize=3)
plt.plot_date(df_cut['date_obj'],df_cut['col4'], color='blue', linestyle='-', markersize=3)

As a result, I get an error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-544-1b8650d1e7e7> in <module>()
/ipython/local/lib/python2.7/site-packages/matplotlib/pyplot.pyc in plot_date(x, y, fmt, tz, xdate, ydate, hold, **kwargs)
   2850     try:
   2851         ret = ax.plot_date(x, y, fmt=fmt, tz=tz, xdate=xdate, ydate=ydate,
-> 2852                            **kwargs)
   2853         draw_if_interactive()
   2854     finally:
ipython/local/lib/python2.7/site-packages/matplotlib/axes.pyc in plot_date(self, x, y, fmt, tz, xdate, ydate, **kwargs)
   4061         if not self._hold: self.cla()
   4062 
-> 4063         ret = self.plot(x, y, fmt, **kwargs)
   4064 
   4065         if xdate:
ipython/local/lib/python2.7/site-packages/matplotlib/axes.pyc in plot(self, *args, **kwargs)
   3994         lines = []
   3995 
-> 3996         for line in self._get_lines(*args, **kwargs):
   3997             self.add_line(line)
   3998             lines.append(line)
ipython/local/lib/python2.7/site-packages/matplotlib/axes.pyc in _grab_next_args(self, *args, **kwargs)
    328                 return
    329             if len(remaining) <= 3:
--> 330                 for seg in self._plot_args(remaining, kwargs):
    331                     yield seg
    332                 return
ipython/local/lib/python2.7/site-packages/matplotlib/axes.pyc in _plot_args(self, tup, kwargs)
    306             x = np.arange(y.shape[0], dtype=float)
    307 
--> 308         x, y = self._xy_from_xy(x, y)
    309 
    310         if self.command == 'plot':
python/local/lib/python2.7/site-packages/matplotlib/axes.pyc in _xy_from_xy(self, x, y)
    222     def _xy_from_xy(self, x, y):
    223         if self.axes.xaxis is not None and self.axes.yaxis is not None:
--> 224             bx = self.axes.xaxis.update_units(x)
    225             by = self.axes.yaxis.update_units(y)
    226 
ipython/local/lib/python2.7/site-packages/matplotlib/axis.pyc in update_units(self, data)
   1299         neednew = self.converter != converter
   1300         self.converter = converter
-> 1301         default = self.converter.default_units(data, self)
   1302         #print 'update units: default=%s, units=%s'%(default, self.units)
   1303         if default is not None and self.units is None:
ipython/local/lib/python2.7/site-packages/matplotlib/dates.pyc in default_units(x, axis)
   1156         'Return the tzinfo instance of *x* or of its first element, or None'
   1157         try:
-> 1158             x = x[0]
   1159         except (TypeError, IndexError):
   1160             pass
ipython/local/lib/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
    611     def __getitem__(self, key):
    612         try:
--> 613             return self.index.get_value(self, key)
    614         except InvalidIndexError:
    615             pass
ipython/local/lib/python2.7/site-packages/pandas/core/index.pyc in get_value(self, series, key)
    761         """
    762         try:
--> 763             return self._engine.get_value(series, key)
    764         except KeyError, e1:
    765             if len(self) > 0 and self.inferred_type == 'integer':

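Strangely, this code works for some DataFrames but not for others. The DataFrames do not differ in structure; the only difference between them is the values they contain.

Can anyone help me solve this problem?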

1 Answer:

Answer 0 (score: 2)

A DataFrame stores dates as numpy.datetime64 objects, not as Python datetime objects.

In addition, matplotlib's plot_date uses its own numeric representation of dates.

You can plot the data like this:
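As a small illustration of that numeric representation, the sketch below converts two Python datetime objects with matplotlib.dates.date2num. The variable names are made up for the example. The absolute numbers depend on the matplotlib version's epoch setting, so only the difference between consecutive days is worth relying on here.

```python
import datetime

import matplotlib.dates as mdates

# Two consecutive days taken from the question's date_obj column.
d1 = datetime.datetime(2012, 11, 19)
d2 = datetime.datetime(2012, 11, 20)

# date2num returns floats counting days since matplotlib's epoch.
n1 = mdates.date2num(d1)
n2 = mdates.date2num(d2)

# One calendar day corresponds to a difference of 1.0.
day_step = n2 - n1

# num2date converts back to a (timezone-aware) datetime.
round_trip = mdates.num2date(n1)

print(day_step)
print(round_trip.date())
```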

plt.plot_date(matplotlib.dates.date2num(pandas.to_datetime(df_cut['date_obj'].values)),df_cut['col1'].values, color='black', linestyle='-', markersize=3, linewidth=2)
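Passing .values may also be what makes the difference here. The traceback in the question ends with x[0] being evaluated on a pandas Series, and with an integer index that lookup goes by label, not by position. The sketch below assumes a hypothetical three-row slice shaped like df_cut, whose index starts at 40038, and shows that dates[0] fails while positional access works. This would be consistent with the code working only for DataFrames whose index happens to contain the label 0, though that reading is an inference from the traceback and not something the original answer states.

```python
import pandas as pd

# Hypothetical slice shaped like the question's df_cut:
# the integer index starts at 40038, so there is no label 0.
df_cut = pd.DataFrame(
    {
        "date_obj": pd.to_datetime(["2012-11-19", "2012-11-20", "2012-11-21"]),
        "col1": [1.000, 2.015, 1.160],
    },
    index=[40038, 40039, 40040],
)

dates = df_cut["date_obj"]

# Label-based lookup: 0 is not an index label, so this raises KeyError.
try:
    dates[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Positional access ignores the index labels.
first_by_position = dates.iloc[0]
# .values drops the index entirely and returns a plain NumPy array.
first_from_array = dates.values[0]

print(lookup_failed)
print(first_by_position)
```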

Alternatively, you can make the 'date_obj' column the index of the data:

df0 = pd.DataFrame.from_records(YourDataSource, columns=['date_obj','col1','col2','col3','col4'],index='date_obj')

Then simply use pandas' plot() method:

df0['col1'].plot()
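For completeness, here is a self-contained sketch of the index-based approach. The records list is a hypothetical stand-in for YourDataSource, filled with a few rows from the question. The Agg backend and the explicit pd.to_datetime conversion are assumptions added so the example runs without a display and ends up with a proper datetime index.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, no display needed

import pandas as pd

# Hypothetical stand-in for YourDataSource: a few rows from the question.
records = [
    ("2012-11-19", 1.000, 0.831856, 0.986209, 0.843919),
    ("2012-11-20", 2.015, 0.521764, 1.177320, 0.938245),
    ("2012-11-21", 1.160, 1.645345, 1.964620, 4.536440),
]
columns = ["date_obj", "col1", "col2", "col3", "col4"]

df0 = pd.DataFrame.from_records(records, columns=columns, index="date_obj")
# The dates were given as strings here, so convert the index explicitly.
df0.index = pd.to_datetime(df0.index)

# Series.plot returns the matplotlib Axes, which can be reused for more lines.
ax = df0["col1"].plot(color="black", linewidth=2)
df0["col2"].plot(ax=ax, color="red")

# Rotate the date labels, as the question did with plt.setp.
ax.figure.autofmt_xdate(rotation=45)
```

With a datetime index, pandas handles the date axis itself, so no manual date2num conversion is needed.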