IPython + Pandas: cannot plot data from a .csv

Time: 2013-09-09 11:52:43

Tags: python-3.x pandas ipython-notebook

I am importing a csv with Pandas in IPython. When I display the DataFrame, it looks like this:


     2013    2012    2011    2010    2009    2008    2007    2006    2005
Jan  11,875  10,989  10,852  11,762  13,850  14,269  14,075  9,222   -
Feb  10,206  10,501  15,713  11,785  13,886  14,289  12,635  13,149  -
Mar  11,235  11,991  14,193  14,239  15,528  14,589  14,519  10,179  -
Apr  NaN     13,617  12,945  14,682  16,953  18,054  14,954  10,549  -
May  NaN     14,645  15,524  15,861  12,357  18,833  16,511  12,889  -
Jun  NaN     14,987  17,740  26,616  13,947  19,580  18,161  13,969  -
Jul  NaN     13,514  19,082  19,880  16,199  20,522  16,537  14,038  -
Aug  NaN     12,830  14,785  16,125  23,438  16,018  16,645  12,430  1,729
Sep  NaN     12,070  13,232  17,081  16,997  16,543  14,372  12,400  5,414
Oct  NaN     11,907  11,027  17,995  12,576  13,535  17,169  14,673  4,920
Nov  NaN     10,623  12,127  12,439  11,926  12,491  13,530  14,313  7,993
Dec  NaN     8,624   8,952   10,498  12,811  14,552  11,573  10,780  6,879
TOTAL    33,316  146,298     166,172     188,963     180,468     193,275     180,681     148,591     26,935

Now I want to plot the data in a chart, but whatever I try, I get "TypeError: Empty 'DataFrame': no numeric data to plot".

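Clearly the DataFrame is not empty and is full of numbers. What am I missing? I was under the impression that Pandas works out by itself which values are numeric.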
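
One way to see what is going on is to check how the columns were actually typed. The snippet below is only a sketch: the three-row `sample` is a hypothetical stand-in for the real csv (whose exact layout is not shown in the question), and the column name `month` is made up for illustration. It shows that cells such as "11,875" or "-" are read as text, so no column counts as numeric.

```python
import io
import pandas as pd

# Hypothetical sample in the same spirit as the table above;
# the real file's separator and quoting are not known.
sample = """month,2013,2012,2005
Jan,"11,875","10,989",-
Feb,"10,206","10,501",-
Aug,,"12,830","1,729"
"""

df = pd.read_csv(io.StringIO(sample), index_col="month")

# Each column is reported with a non-numeric (text) dtype.
print(df.dtypes)

# Keep only the numeric columns; with this sample none are left,
# which is the situation df.plot() complains about.
numeric = df.select_dtypes(include="number")
print(numeric.shape)
```

If `numeric` has zero columns, the values need to be converted before plotting.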

2 answers:

Answer 0: (score: 3)

Thanks for all the suggestions! They pointed me in the right direction. I managed to solve the problem with:

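# Strip the thousands separators, e.g. '11,875' -> '11875'
df = df.replace(',', '', regex=True)
# Turn the '-' placeholders into NaN, then convert every column to float
df = df.replace('-', 'NaN', regex=True).astype('float')
df.plot()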
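
A variant of the same cleanup, sketched here under assumptions: `raw` is a hypothetical two-row frame and `to_number` is a helper name invented for this example. Instead of a regex replace on '-', it strips the commas and lets `pd.to_numeric` with `errors="coerce"` turn anything unparseable into NaN, which would also leave negative numbers intact if the data had any.

```python
import pandas as pd

# Hypothetical frame with the same kind of string cells as in the question.
raw = pd.DataFrame(
    {"2006": ["9,222", "13,149"], "2005": ["-", "1,729"]},
    index=["Jan", "Aug"],
)

def to_number(column):
    # Remove thousands separators, then parse each cell;
    # cells that are not numbers (such as a lone "-") become NaN.
    stripped = column.astype(str).str.replace(",", "", regex=False)
    return pd.to_numeric(stripped, errors="coerce")

# Apply the conversion column by column.
clean = raw.apply(to_number)
print(clean)
```

After this, `clean.plot()` has numeric columns to work with.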

Answer 1: (score: 2)

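Taking your data and replacing "," with "." and "-" with "NaN", it works fine (note that this treats the comma as a decimal point, so every value ends up divided by 1000; the shape of the plot is unchanged):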

>>> s="""     2013    2012    2011    2010    2009    2008    2007    2006    2005
Jan  11,875  10,989  10,852  11,762  13,850  14,269  14,075  9,222   -
Feb  10,206  10,501  15,713  11,785  13,886  14,289  12,635  13,149  -
Mar  11,235  11,991  14,193  14,239  15,528  14,589  14,519  10,179  -
Apr  NaN     13,617  12,945  14,682  16,953  18,054  14,954  10,549  -
May  NaN     14,645  15,524  15,861  12,357  18,833  16,511  12,889  -
Jun  NaN     14,987  17,740  26,616  13,947  19,580  18,161  13,969  -
Jul  NaN     13,514  19,082  19,880  16,199  20,522  16,537  14,038  -
Aug  NaN     12,830  14,785  16,125  23,438  16,018  16,645  12,430  1,729
Sep  NaN     12,070  13,232  17,081  16,997  16,543  14,372  12,400  5,414
Oct  NaN     11,907  11,027  17,995  12,576  13,535  17,169  14,673  4,920
Nov  NaN     10,623  12,127  12,439  11,926  12,491  13,530  14,313  7,993
Dec  NaN     8,624   8,952   10,498  12,811  14,552  11,573  10,780  6,879
TOTAL    33,316  146,298     166,172     188,963     180,468     193,275     180,681     148,591     26,935"""

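>>> import pandas as pd
>>> from io import StringIO
>>> s = s.replace(',', '.')      # treat the comma as a decimal point
>>> s = s.replace('-', 'NaN')    # mark the missing 2005 values as NaN
>>> df = pd.read_csv(StringIO(s), sep=r'\s+')
>>> df.plot()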
<matplotlib.axes.AxesSubplot at 0x88a4790>

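Interestingly, according to the read_csv docstring there is a parameter for specifying the decimal separator, but it does not seem to work in my version (0.11.0).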
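
Besides `decimal`, `read_csv` also accepts `thousands` and `na_values`, which may let the parser do the whole conversion while reading. The following is a minimal sketch, assuming a reasonably recent pandas (it has not been checked against 0.11.0); the whitespace-separated `sample` and the `month` header are hypothetical stand-ins for the real file.

```python
import io
import pandas as pd

# Hypothetical whitespace-separated sample modelled on the table above.
sample = """month 2013 2012 2005
Jan 11,875 10,989 -
Feb 10,206 10,501 -
Aug NaN 12,830 1,729
"""

df = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",        # one or more whitespace characters between fields
    index_col=0,       # use the month labels as the index
    thousands=",",     # read '11,875' as 11875
    na_values=["-"],   # read '-' as a missing value
)

print(df.dtypes)
```

With the columns parsed as numbers, `df.plot()` no longer needs any string replacement beforehand, and the values keep their original scale.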