I'm having a problem with the interaction between a pandas DataFrame and the numpy histogram2d function. Specifically, this code runs fine:
import numpy
import pandas
# numpy is imported under its full name, so call numpy.random rather than np.random
df = pandas.DataFrame(numpy.random.randn(100, 2), columns=list('AB'))
hist, xe, ye = numpy.histogram2d(df["A"], df["B"])
whereas this code, where I build the histogram from a subset of the DataFrame, fails:
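import numpy
import pandas
# numpy is imported under its full name, so call numpy.random rather than np.random
df = pandas.DataFrame(numpy.random.randn(100, 2), columns=list('AB'))
# keep only the rows where column A is negative
dfSubset = pandas.DataFrame(df[df["A"] < 0])
hist, xe, ye = numpy.histogram2d(dfSubset["A"], dfSubset["B"])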
It raises the following exception:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-763e2355a7e1> in <module>()
1 dfSubset = pandas.DataFrame(df[df["A"] < 0])
----> 2 hist, xe, ye = numpy.histogram2d(dfSubset["A"], dfSubset["B"])
/home/mark/.virtualenvs/ipython/lib/python2.6/site-packages/numpy/lib/twodim_base.pyc in histogram2d(x, y, bins, range, normed, weights)
651 xedges = yedges = asarray(bins, float)
652 bins = [xedges, yedges]
--> 653 hist, edges = histogramdd([x, y], bins, range, normed, weights)
654 return hist, edges[0], edges[1]
655
/home/mark/.virtualenvs/ipython/lib/python2.6/site-packages/numpy/lib/function_base.pyc in histogramdd(sample, bins, range, normed, weights)
312 smax = ones(D)
313 else:
--> 314 smin = atleast_1d(array(sample.min(0), float))
315 smax = atleast_1d(array(sample.max(0), float))
316 else:
/home/mark/.virtualenvs/ipython/lib/python2.6/site-packages/numpy/core/_methods.pyc in _amin(a, axis, out, keepdims)
19 def _amin(a, axis=None, out=None, keepdims=False):
20 return um.minimum.reduce(a, axis=axis,
---> 21 out=out, keepdims=keepdims)
22
23 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
/home/mark/.virtualenvs/ipython/lib/python2.6/site-packages/pandas/core/generic.pyc in __nonzero__(self)
663 raise ValueError("The truth value of a {0} is ambiguous. "
664 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 665 .format(self.__class__.__name__))
666
667 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
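From some searching I gather that what the truth value of a Python container should be is a contentious issue, and that pandas and numpy have mismatched expectations about this behaviour. What I don't know is how to resolve the problem in practice.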
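For reference, the ambiguity named in the error message can be reproduced in isolation. The following is only a minimal sketch with made-up Series values; it is not the code path numpy takes inside histogramdd, it just shows that asking a multi-element Series for a single truth value raises the same ValueError, while an explicit reduction does not:

```python
import pandas

# Hypothetical values, chosen only for illustration.
s = pandas.Series([-0.5, -1.2])

# Converting a multi-element Series to a single bool is refused by pandas.
try:
    bool(s)
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

# Reducing explicitly with any() or all() gives an unambiguous answer.
any_negative = bool((s < 0).any())
all_negative = bool((s < 0).all())
```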
Can anyone suggest a way to fix this?
I'm running Python 2.6.6 with an IPython notebook, and the virtualenv I'm working in has the following packages installed:
Babel==0.9.4
Beaker==1.3.1
Jinja2==2.2.1
Magic-file-extensions==0.1
Mako==0.3.4
MarkupSafe==0.9.2
OpenEye-python2.6-redhat-6-x64==2013.10.3
PIL==1.1.6
Pygments==1.1.1
SSSDConfig==1.9.2
Sphinx==0.6.6
argparse==1.2.1
backports.ssl-match-hostname==3.4.0.2
cas==0.15
cups==1.0
cupshelpers==1.0
decorator==3.0.1
docutils==0.6
ethtool==0.6
firstboot==1.110
freeipa==2.0.0.alpha.0
git-remote-helpers==0.1.0
iniparse==0.3.1
iotop==0.3.2
ipapython==3.0.0
ipython==1.1.0
iwlib==1.0
kerberos==1.0
lxml==2.2.3
matplotlib==1.1.1
netaddr==0.7.5
nose==0.10.4
numpy==1.8.0
pandas==0.13.0
paramiko==1.7.5
patsy==0.2.1
pyOpenSSL==0.10
pycrypto==2.0.1
pycurl==7.19.0
pygpgme==0.1
python-dateutil==2.2
python-default-encoding==0.1
python-ldap==2.3.10
python-meh==0.11
python-nss==0.11
pytz==2013.9
pyxdg==0.18
pyzmq==14.0.1
qpid-python==0.14
qpid-tools==0.14
scdate==1.9.60
scikit-learn==0.14.1
scipy==0.13.2
sckdump==2.0.5
scservices==0.99.45
scservices.dbus==0.99.45
six==1.5.2
slip==0.2.20
slip.dbus==0.2.20
slip.gtk==0.2.20
smbc==1.0
stevedore==0.13
sympy==0.7.4.1
tornado==3.2
urlgrabber==3.9.1
virtinst==0.600.0
virtualenv==1.11.1
virtualenv-clone==0.2.4
virtualenvwrapper==4.2
yum-metadata-parser==1.1.2
Thanks!
Answer 0 (score: 3)
Change the line:
hist, xe, ye = numpy.histogram2d(dfSubset["A"], dfSubset["B"])
to:
hist, xe, ye = numpy.histogram2d(dfSubset["A"].values, dfSubset["B"].values)
This coerces each Series to a plain numpy array before it is passed to histogram2d.
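Putting the change together with the setup from the question gives a complete script along these lines. This is a sketch rather than a tested reproduction on the question's exact versions: the fixed seed is an addition so the run is repeatable, and the final check relies on the default of 10 bins per axis spanning the full data range.

```python
import numpy
import pandas

# Seeded generator (an addition for repeatability, not part of the question).
rng = numpy.random.RandomState(0)
df = pandas.DataFrame(rng.randn(100, 2), columns=list("AB"))

# Boolean-mask selection keeps the original row labels, so the subset's
# index is no longer 0..n-1.
dfSubset = df[df["A"] < 0]

# .values hands numpy plain ndarrays, with no pandas index attached.
x = dfSubset["A"].values
y = dfSubset["B"].values

hist, xe, ye = numpy.histogram2d(x, y)

# Every selected row should fall in exactly one bin.
total_counted = int(hist.sum())
```

On pandas 0.24 and later, `dfSubset["A"].to_numpy()` is the documented equivalent of `.values`; the environment in the question (pandas 0.13.0) predates it, so `.values` is the form to use there.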