Handling a ValueError in Python when performing range binning

Time: 2018-05-04 14:57:19

Tags: python python-2.7 pandas-datareader

I am trying to bin the values of a pandas column into ranges, but when I use bisect I get a ValueError:
from pandas_datareader import data
import pandas
import bisect
import fix_yahoo_finance as yf

yf.pdr_override() 
df = data.get_data_yahoo('SPY', '2015-01-01', '2018-04-05')
df.tail(2)

def Daily_Returns(A, B):
    return (B - A)*100/A

df['OC_Return_%'] = Daily_Returns(df['Open'], df['Close'])

def b(value):
    intervals = ['Less Than -10 %','-10% to -5%','-5% to -2.5%','-2.5% to -2%','-2% to -1.5%','-1.5% to -1%','-1% to -0.5%','-0.5% to 0%','0% to 0.5%','0.5% to 1%','1% to 1.5%','1.5% to 2%','2% to 2.5%','2.5% to 5%','5% to 10%','Greater Than 10 %']
    return intervals[bisect.bisect_left([-float('inf'),-10,-5,-2.5,-2,-1.5,-1,-0.5,0,0.5,1,1.5,2,2.5,5,10,float('inf')], value)-1]

df['OC_Return_Bin'] = b(df["OC_Return_%"])
df

If I use a.any() or a.all(), the error goes away, but the result column is then filled with the wrong values.

Here is the full traceback, as requested in the comments.

ValueError                                Traceback (most recent call last)
<ipython-input-80-a571e502f6a6> in <module>()
 17     return intervals[bisect.bisect_left([-float('inf'),-10,-5,-2.5,-2,-1.5,-1,-0.5,0,0.5,1,1.5,2,2.5,5,10,float('inf')], value)-1]
 18 
 19 df['OC_Return_Bin'] = b(df["OC_Return_%"])
 20 df

<ipython-input-80-a571e502f6a6> in b(value)
 15 def b(value):
 16     intervals = ['Less Than -10 %','-10% to -5%','-5% to -2.5%','-2.5% to -2%','-2% to -1.5%','-1.5% to -1%','-1% to -0.5%','-0.5% to 0%','0% to 0.5%','0.5% to 1%','1% to 1.5%','1.5% to 2%','2% to 2.5%','2.5% to 5%','5% to 10%','Greater Than 10 %']
 17     return intervals[bisect.bisect_left([-float('inf'),-10,-5,-2.5,-2,-1.5,-1,-0.5,0,0.5,1,1.5,2,2.5,5,10,float('inf')], value)-1]
 18 
 19 df['OC_Return_Bin'] = b(df["OC_Return_%"])

C:\Users\USER\Anaconda2\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
953         raise ValueError("The truth value of a {0} is ambiguous. "
954                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
955                          .format(self.__class__.__name__))
956 
957     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

1 Answer:

Answer 0 (score: 1)

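The problem is that your function b can only handle a single value, not a Series of values. bisect.bisect_left compares value against the edges with <, and on a Series that comparison returns a Series of booleans whose truth value is ambiguous, hence the ValueError. To fix this, you can either apply the function element by element with Series.apply, e.g. df['OC_Return_Bin'] = df["OC_Return_%"].apply(b), or rewrite b so that it works on a Series.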
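
For the second option, one possible way to bin a whole Series at once is pandas.cut. The following is only a minimal sketch, not necessarily what the answer had in mind: it reuses the edges and labels from the question, and the values in `returns` are made-up sample data standing in for df['OC_Return_%'].

```python
import bisect

import numpy as np
import pandas as pd

# Same bin edges and labels as in the question's b() function.
edges = [-np.inf, -10, -5, -2.5, -2, -1.5, -1, -0.5, 0,
         0.5, 1, 1.5, 2, 2.5, 5, 10, np.inf]
labels = ['Less Than -10 %', '-10% to -5%', '-5% to -2.5%', '-2.5% to -2%',
          '-2% to -1.5%', '-1.5% to -1%', '-1% to -0.5%', '-0.5% to 0%',
          '0% to 0.5%', '0.5% to 1%', '1% to 1.5%', '1.5% to 2%',
          '2% to 2.5%', '2.5% to 5%', '5% to 10%', 'Greater Than 10 %']


def b(value):
    # Scalar version from the question: works on one number at a time.
    return labels[bisect.bisect_left(edges, value) - 1]


# Hypothetical sample returns (assumption, not data from the question).
returns = pd.Series([-12.0, -0.3, 0.0, 0.3, 1.2, 11.0])

# Option 1: call the scalar function once per element.
by_apply = returns.apply(b)

# Option 2: vectorized binning. The default right=True makes each bin
# include its right edge, which matches bisect_left(...) - 1 above.
by_cut = pd.cut(returns, bins=edges, labels=labels)

print(pd.DataFrame({'return': returns, 'apply': by_apply, 'cut': by_cut}))
```

Both columns should agree for ordinary finite values. pd.cut returns a categorical column, which keeps the bins in their natural order when sorting or grouping.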