Question

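I am passing a large DataFrame to a function I wrote so that it can perform calculations based on a condition, but Python raises an error. I think this is because I check whether a column equals a specific value and then perform one calculation, otherwise another.

I am trying to do the calculation on whole pandas arrays rather than looping over the data and computing row by row, because the dataset is large.

A subset of the data looks like this: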

import pandas as pd
myData = pd.DataFrame({'K': [810, 820, 825, 830, 840],
                       'Type': ['C', 'C', 'P', 'P', 'C'],
                       'S': [978, 978, 978, 978, 978],
                       'R': [0.05, 0.05, 0.05, 0.05, 0.05]})

The function that takes the DataFrame columns is as follows:

def function(type,S,K,r):
    if type == 'C':
        calc = S / K * r
    elif type == 'P':
        calc = (S + r) / K - r * 10
    return calc

I tried to pass myData to the function like this:

function(myData['Type'],myData['S'],myData['K'],myData['R'])

The error message is:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I think the error is related to myData['Type'] and the condition == 'C'. Is there a way around this, or do I have to iterate over the dataset and compute each row? Thanks.
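For readers who want to see where the message comes from, here is a minimal sketch that reproduces it outside the function. The variable names `types`, `mask` and `raised` are made up for illustration; the point is only that comparing a Series with `==` yields one boolean per row, while `if` needs a single True/False.

```python
import pandas as pd

types = pd.Series(['C', 'C', 'P', 'P', 'C'])

# Element-wise comparison: the result is a boolean Series, one value per row.
mask = types == 'C'
print(mask.tolist())

# `if` asks for one truth value for the whole Series, which pandas refuses to guess.
try:
    if mask:
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

So the comparison itself is fine; it is the `if`/`elif` around it that cannot work on a whole column at once.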

Answer 1

IIUC, you can use np.select:

import numpy as np

# One boolean mask per case, in the same order as the choices below.
condition = [myData.Type == 'C', myData.Type == 'P']

choiceList = [myData.S / myData.K * myData.R, (myData.S + myData.R) / myData.K - myData.R * 10]

np.select(condition, choiceList)

Output:

array([ 0.06037037,  0.05963415,  0.68551515,  0.67837349,  0.05821429])
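If it helps, the same idea can be wrapped back into a function and written to a new column. This is only a sketch: the name `calc_vectorized` and the `calc` column are assumptions for illustration, not part of the question. Note that np.select fills rows matching no condition with its `default`, which is 0 unless you say otherwise, so passing `default=np.nan` makes unexpected Type values easier to spot.

```python
import numpy as np
import pandas as pd

myData = pd.DataFrame({
    'K': [810, 820, 825, 830, 840],
    'Type': ['C', 'C', 'P', 'P', 'C'],
    'S': [978, 978, 978, 978, 978],
    'R': [0.05, 0.05, 0.05, 0.05, 0.05],
})

def calc_vectorized(df):
    """Hypothetical helper: apply the 'C' / 'P' formulas to whole columns."""
    conditions = [df['Type'] == 'C', df['Type'] == 'P']
    choices = [
        df['S'] / df['K'] * df['R'],
        (df['S'] + df['R']) / df['K'] - df['R'] * 10,
    ]
    # Rows whose Type is neither 'C' nor 'P' become NaN instead of the default 0.
    return np.select(conditions, choices, default=np.nan)

myData['calc'] = calc_vectorized(myData)
print(myData)
```

Both formulas are evaluated for every row and np.select then picks one per row, which is usually still much faster than a Python-level loop on large data.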

Answer 2

I tried using myData.groupby('Type'):

a = myData.groupby('Type')
c = a.get_group('C')
p = a.get_group('P')
# Only 'C' rows get a value here; index alignment leaves 'P' rows as NaN.
myData['calc'] = c.S / c.K * c.R

# Fill the remaining NaN rows with the 'P' formula.
myData['calc'] = myData['calc'].fillna((p.S + p.R) / p.K - p.R * 10)


myData

The result:

Out[149]:
    K    R       S  Type    calc
0   810 0.05    978 C   0.060370
1   820 0.05    978 C   0.059634
2   825 0.05    978 P   0.685515
3   830 0.05    978 P   0.678373
4   840 0.05    978 C   0.058214
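A possible variation, sketched here with boolean masks and `.loc` instead of groupby, relies on the same idea of writing each formula only into its own rows. The names `is_c` and `is_p` are made up for illustration, and rows with any other Type simply stay NaN.

```python
import pandas as pd

myData = pd.DataFrame({
    'K': [810, 820, 825, 830, 840],
    'Type': ['C', 'C', 'P', 'P', 'C'],
    'S': [978, 978, 978, 978, 978],
    'R': [0.05, 0.05, 0.05, 0.05, 0.05],
})

is_c = myData['Type'] == 'C'
is_p = myData['Type'] == 'P'

# Start with NaN everywhere so unmatched rows are easy to spot.
myData['calc'] = float('nan')

# Compute each formula for all rows, then write only the rows selected by the mask.
c_values = myData['S'] / myData['K'] * myData['R']
p_values = (myData['S'] + myData['R']) / myData['K'] - myData['R'] * 10
myData.loc[is_c, 'calc'] = c_values[is_c]
myData.loc[is_p, 'calc'] = p_values[is_p]

print(myData)
```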

With under 50 reputation it is hard to add comments! If you like this answer, please accept it!

Passing a pandas DataFrame to a function and then performing a conditional calculation
