Question

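I want to do some statistical processing on my data based on a condition. However, I keep getting the error below at the if statement, and I think it most likely happens because I cannot access the values inside the float object Q11.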

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

import numpy as np
import pandas as pd

raw_data = {'patient': [242, 151, 111,122, 342],
        'obs': [1, 2, 3, 1, 2],
        'treatment': [0, 1, 0, 1, 0],
        'score': ['strong', 'weak', 'weak', 'weak', 'strong']}

df = pd.DataFrame(raw_data, columns = ['patient', 'obs', 'treatment', 'score'])

#print(df)



   patient  obs  treatment   score
0      242    1          0  strong
1      151    2          1    weak
2      111    3          0    weak
3      122    1          1    weak
4      342    2          0  strong

I defined the following procedure to get the information:

df_g = df.groupby("score")

veni_vidi = []

for col in df.columns:
    if col == 'patient':
        Q11 = df_g[col].transform(lambda group: np.percentile(group, q=25))
        Q11.reset_index(inplace=True, drop=True)  # trying to drop index from here but it seems not working!

        for val in df[col]:
            if val < Q11:  # This is giving error because of index I guess
                veni_vidi.append('veni')
            else:
                veni_vidi.append('vici')

I tried to get rid of the index by doing this:

Q11.reset_index(inplace=True,drop=True)

0    267.0
1    116.5
2    116.5
3    116.5
4    267.0
Name: patient, dtype: float64

But it does not solve the problem.

Thanks in advance!

Answer 1

We can fix this with np.where:

df_g = df.groupby("score")

veni_vidi = []

for col in df.columns:
    if col == 'patient':
        Q11 = df_g[col].transform(lambda group: np.percentile(group, q=25))

        for val in df[col]:
            veni_vidi.append(np.where(val < Q11, 'veni', 'vici'))
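
Note that inside the loop, val < Q11 compares one scalar against the whole Series, so each iteration yields an array with one entry per row rather than a single label. That is also why the original if statement fails: the comparison returns a boolean Series, which has no single truth value. If the goal is one label per row (an assumption about the asker's intent), a loop-free sketch could compare the two Series element-wise. The names q1 and labels below are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'patient': [242, 151, 111, 122, 342],
    'obs': [1, 2, 3, 1, 2],
    'treatment': [0, 1, 0, 1, 0],
    'score': ['strong', 'weak', 'weak', 'weak', 'strong'],
})

# 25th percentile of 'patient' within each 'score' group,
# broadcast back so that every row carries its own group's value.
q1 = df.groupby('score')['patient'].transform(lambda g: np.percentile(g, 25))

# Element-wise comparison: row i of 'patient' against row i of q1.
labels = np.where(df['patient'] < q1, 'veni', 'vici')

# Plain Python list with one label per row.
veni_vidi = labels.tolist()
print(veni_vidi)
```

Because both Series share the same index as df, no reset_index call is needed; the comparison lines up row by row.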

Can't remove index information from float64 object
