Question

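What my input data is:

A triangular matrix stored in a pandas DataFrame with a defined index and column names
A list with the same length as the number of columns in the DataFrame
A function that takes items of the list as input

What I want to do next:

Apply the function to the list based on the values in the DataFrame
According to the columns of the DataFrame

A small example: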

import numpy as np
import pandas as pd

scores = np.array([[1,2,1.5,0.75],
                 [0,1,0.75,1.25],
                 [0,0,1,2],
                 [0,0,0,1]])
names = ['Andy','Bob','Craig','Dan']

bets = [100,120,135,130]

def getPrize(bet, x): # x defined somewhere elsewhere
    prize = bet*x #do stuff here
    return prize

names1 = ['Andy1','Bob1','Craig1','Dan1']

Results = pd.DataFrame(data=scores,index=names1,columns=names1)

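Now I define a condition on the values in the DataFrame and, based on that condition, I want to find the position of the relevant column (an integer value, sort of the reverse of using df.iloc to look something up).

What I tried: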

for i, r in Results.iterrows():
    found = r[r>1]
    col_index = r.columns.get_loc(found)
    print col_index

But here I run into AttributeError: 'Series' object has no attribute 'columns'. And if I write this instead:

col_ix, col_name = found.iteritems()

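I get ValueError: need more than 1 value to unpack - so am I not using iteritems correctly? However, if I print the values, they are printed before the error is thrown.

In the end, I would like to obtain a plot with "prize" on the y axis and the names on the x axis, plotting the values for each person for the values chosen by the condition (so another thing I want to achieve is to find which item of the {{1}} list is a substring of each column name I generate).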

Answer 1

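The simplest approach is just to multiply (judging by the output, results here is the scores DataFrame built with names as both index and columns):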

print (results.mul(np.array(bets)))
        Andy    Bob   Craig    Dan
Andy   100.0  240.0  202.50   97.5
Bob      0.0  120.0  101.25  162.5
Craig    0.0    0.0  135.00  260.0
Dan      0.0    0.0    0.00  130.0
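
The multiplication above relies on the bets lining up with the columns by position. As a minimal sketch, assuming each bet belongs to the column with the same position in names, the bets can instead be put in a Series indexed by column name so that pandas aligns them by label. The name bets_by_name is made up for this illustration.

```python
import numpy as np
import pandas as pd

scores = np.array([[1, 2, 1.5, 0.75],
                   [0, 1, 0.75, 1.25],
                   [0, 0, 1, 2],
                   [0, 0, 0, 1]])
names = ['Andy', 'Bob', 'Craig', 'Dan']
bets = [100, 120, 135, 130]

results = pd.DataFrame(scores, index=names, columns=names)

# Hypothetical helper name: one bet per column label (an assumption about
# how bets relate to columns, not something stated in the question).
bets_by_name = pd.Series(bets, index=results.columns)

# axis='columns' matches the Series index against the column labels,
# so every row is multiplied element-wise by the bets.
prizes = results.mul(bets_by_name, axis='columns')
print(prizes)
```

With label alignment the result no longer depends on the order of the bets list, only on the labels of bets_by_name.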

But if the real function is more complicated, use DataFrame.apply:

def getPrize(bet,score):
    #working with Series score and list bets
    print (bet)
    print (score)
    prize = bet*score
    return prize

df = results.apply(lambda x: getPrize(bets, x), axis=1)
print (df)

        Andy    Bob   Craig    Dan
Andy   100.0  240.0  202.50   97.5
Bob      0.0  120.0  101.25  162.5
Craig    0.0    0.0  135.00  260.0
Dan      0.0    0.0    0.00  130.0

import matplotlib.pyplot as plt

plt.xticks(np.arange(len(df.columns)), df.columns)
plt.plot(df.values)
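
The question also asks for prizes only for the values chosen by the condition. A possible sketch, assuming the condition is score > 1 as in the asker's loop and that a per-person total is what should go on the y axis (both are assumptions), masks the scores first and multiplies afterwards. The names selected, prizes and per_person are made up for this illustration.

```python
import numpy as np
import pandas as pd

scores = np.array([[1, 2, 1.5, 0.75],
                   [0, 1, 0.75, 1.25],
                   [0, 0, 1, 2],
                   [0, 0, 0, 1]])
names = ['Andy', 'Bob', 'Craig', 'Dan']
bets = [100, 120, 135, 130]

results = pd.DataFrame(scores, index=names, columns=names)
bets_by_name = pd.Series(bets, index=results.columns)

# Keep only scores that satisfy the condition; everything else becomes NaN.
selected = results.where(results > 1)

# Prize for each selected score, aligned with the bets by column label.
prizes = selected.mul(bets_by_name, axis='columns')

# Assumed aggregation: total prize per column name (NaN values are skipped).
per_person = prizes.sum(axis=0)
print(per_person)
```

per_person is a Series indexed by name, so its index and values could serve as the x and y data of a bar plot.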

Edit:

If you need the positions of all matching columns, use a list comprehension (or some loop), because Index.get_loc only works with scalars:

for i, r in Results.iterrows():
    found = r[r>1]
    col_index = [r.index.get_loc(x) for x in found.index]
    print (col_index)

[1, 2]
[3]
[3]
[]
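
As a rough sketch of the same idea without the inner list comprehension, Index.get_indexer takes a list-like of labels and returns their integer positions in one call. The block also shows Series.items() being unpacked inside a loop, since it yields one (label, value) pair per element rather than a single pair, and a simple substring match between the generated column names and names. The variables positions, pairs and base_name are made up for this illustration, and the frame is built with names1 as in the question.

```python
import numpy as np
import pandas as pd

scores = np.array([[1, 2, 1.5, 0.75],
                   [0, 1, 0.75, 1.25],
                   [0, 0, 1, 2],
                   [0, 0, 0, 1]])
names = ['Andy', 'Bob', 'Craig', 'Dan']
names1 = ['Andy1', 'Bob1', 'Craig1', 'Dan1']

Results = pd.DataFrame(data=scores, index=names1, columns=names1)

# Integer column positions of the values > 1, collected per row label.
positions = {}
for row_label, r in Results.iterrows():
    found = r[r > 1]
    positions[row_label] = r.index.get_indexer(found.index).tolist()
print(positions)

# items() yields (column label, value) pairs, so unpack them in a loop.
first_row = Results.loc['Andy1']
pairs = [(col, val) for col, val in first_row[first_row > 1].items()]
print(pairs)

# Assumed matching rule: the first item of names contained in the column name.
base_name = {col: next((n for n in names if n in col), None)
             for col in Results.columns}
print(base_name)
```

Series.iteritems was removed in pandas 2.0, so items() is used here; on older versions both behave the same way.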

Finding certain column names and positions in a pandas DataFrame
