Does importing pandas in python change how matplotlib handles datetime objects?

Time: 2012-12-21 10:10:00

Tags: python datetime matplotlib pandas

On my debian squeeze system I ran into a python problem that can be boiled down to the following:

import numpy
import datetime
from matplotlib import pyplot
x = [datetime.datetime.utcfromtimestamp(i) for i in numpy.arange(100000,200000,3600)]
y = range(len(x))

# See matplotlib handle a series of datetimes just fine..
pyplot.plot(x, y)
# [<matplotlib.lines.Line2D object at 0xad10f4c>]

import pandas

# Now we try exactly what we did before..
pyplot.plot(x, y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.6/matplotlib/pyplot.py", line 2141, in plot
    ret = ax.plot(*args, **kwargs)
  File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 3432, in plot
    for line in self._get_lines(*args, **kwargs):
  File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 311, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 288, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 204, in _xy_from_xy
    bx = self.axes.xaxis.update_units(x)
  File "/usr/lib/pymodules/python2.6/matplotlib/axis.py", line 982, in update_units
    self._update_axisinfo()
  File "/usr/lib/pymodules/python2.6/matplotlib/axis.py", line 994, in _update_axisinfo
    info = self.converter.axisinfo(self.units, self)
  File "/usr/local/lib/python2.6/dist-packages/pandas/tseries/converter.py", line 184, in axisinfo
    majfmt = PandasAutoDateFormatter(majloc, tz=tz)
  File "/usr/local/lib/python2.6/dist-packages/pandas/tseries/converter.py", line 195, in __init__
    dates.AutoDateFormatter.__init__(self, locator, tz, defaultfmt)
TypeError: __init__() takes at most 3 arguments (4 given)

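I'm not interested in the cause of the specific error shown. It is clear that pandas expects a different version of matplotlib: getting one package from the standard debian repository and another via pip is risky, and by letting pip upgrade matplotlib I have "solved" that part of the problem.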

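The real question, and it comes in three parts: how can merely importing pandas break matplotlib's ability to handle datetime objects, when just two lines earlier pandas was apparently not even involved in the very same operation? Does pandas, on import, silently change other modules in the top-level namespace to force them to use pandas methods? And is this acceptable behaviour for a python module? I need to be able to rely on the fact that, for example, importing a random-number module will not silently change, say, the pickle module so that it applies a random salt to everything it writes.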

Update with more information

python is 2.6.6 (current debian stable, from package 2.6.6-3+squeeze7)

matplotlib version is debian's 0.99.3-1 (current debian stable, from python-matplotlib)

pandas version is 0.9.0 (installed with 'pip install pandas' a while ago, not today)

Platform is i386 running debian squeeze

Steps to reproduce

  1. (Obviously) bootstrap a clean debian squeeze i386 install and chroot into it.
  2. apt-get update
  3. apt-get install python python-matplotlib
  4. apt-get install python-pip build-essential python-dev
  5. pip install --upgrade numpy
  6. pip install pandas
  7. Now start an interactive python session

    import numpy
    import datetime
    # Next two lines added to original example to avoid hassle with DISPLAY in chroot
    import matplotlib
    matplotlib.use('agg')
    from matplotlib import pyplot
    
    x = [datetime.datetime.utcfromtimestamp(i) for i in numpy.arange(100000,200000,3600)]
    y = range(len(x))
    
    pyplot.plot(x, y)
    
    import pandas
    
    pyplot.plot(x, y)
    

1 answer:

Answer 0: (score: 3)

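When you import pandas, it registers a bunch of unit converters with matplotlib. The output below is from newer versions of both libraries, but I think the overall behaviour is the same.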

In [4]: import matplotlib.units as muints

In [5]: muints.registry
Out[5]: 
  {datetime.date: <matplotlib.dates.DateConverter instance at 0x2ab8908>,
   datetime.datetime: <matplotlib.dates.DateConverter instance at 0x2ab8ab8>}


In [6]: import pandas

In [7]: muints.registry
Out[7]: 
{pandas.tseries.period.Period: <pandas.tseries.converter.PeriodConverter instance at 0x2627e60>,
 pandas.tslib.Timestamp: <pandas.tseries.converter.DatetimeConverter instance at 0x264ea28>,
 datetime.date: <pandas.tseries.converter.DatetimeConverter instance at 0x2532fc8>,
 datetime.datetime: <pandas.tseries.converter.DatetimeConverter instance at 0x2627ab8>,
 datetime.time: <pandas.tseries.converter.TimeConverter instance at 0x2532f38>}

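The axis uses this registry (through a few layers of redirection) to work out how to format non-numeric information, matching on the class of the thing it is trying to label (hence the entries in the dict are keyed on datetime.*).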
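To make the mechanism concrete, here is a minimal pure-Python sketch of a class-keyed registry. It is a toy model, not matplotlib's actual code: the names `registry`, `get_converter`, `DefaultDateConverter` and `ThirdPartyDateConverter` are all hypothetical, and the lookup rule (first match along the value's class hierarchy) is an assumption made for illustration. It shows how code that runs at import time can change which converter is picked for `datetime.datetime` without the caller doing anything differently.

```python
import datetime

# Toy stand-in for a module-level registry such as matplotlib.units.registry.
# Keys are classes, values are converter objects. All names here are hypothetical.
registry = {}


class DefaultDateConverter:
    name = "plotting-library default"


class ThirdPartyDateConverter:
    name = "third-party override"


def get_converter(value):
    """Return the converter registered for the first class in value's MRO, or None."""
    for cls in type(value).__mro__:
        if cls in registry:
            return registry[cls]
    return None


# What the plotting library might do when it is first imported.
registry[datetime.date] = DefaultDateConverter()
registry[datetime.datetime] = DefaultDateConverter()

stamp = datetime.datetime(2012, 12, 21, 10, 10)
before = get_converter(stamp).name

# What another package might do as a side effect of being imported:
# overwrite the entry for a class it does not own.
registry[datetime.datetime] = ThirdPartyDateConverter()
after = get_converter(stamp).name

print(before)
print(after)
```

Because the registry is shared module-level state, the second assignment affects every later lookup, which matches the symptom in the question: the same `pyplot.plot(x, y)` call takes a different code path once the other package has been imported.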

I suspect you can work around this by replacing the offending entries in the dict.
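A minimal sketch of that workaround, assuming it runs after `import pandas` and that putting matplotlib's own `DateConverter` back for `datetime.date` and `datetime.datetime` is enough. The helper name `restore_matplotlib_date_converters` is made up for this example, and this has not been checked against the exact versions in the question.

```python
import datetime

import matplotlib
matplotlib.use("agg")  # headless backend, as in the reproduction steps
import matplotlib.dates as mdates
import matplotlib.units as munits


def restore_matplotlib_date_converters():
    """Point the datetime entries of the units registry back at matplotlib's converter.

    Hypothetical helper: it overwrites whatever is currently registered for
    datetime.date and datetime.datetime and returns the converter it installed.
    """
    converter = mdates.DateConverter()
    for cls in (datetime.date, datetime.datetime):
        munits.registry[cls] = converter
    return converter


# Assumption: in a real session this is called right after `import pandas`.
restored = restore_matplotlib_date_converters()
print(type(munits.registry[datetime.datetime]).__name__)
```

Newer pandas releases also expose `pandas.plotting.deregister_matplotlib_converters()`, which is probably the cleaner route where it is available. The manual replacement above is only a fallback for older installs.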