Pandas: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Date: 2017-04-26 08:32:42

Tags: python pandas

I am trying to create a new column in a pandas DataFrame using a function that takes two columns as arguments:

def ipf_cat(var, con):
    if var == "Idiopathic pulmonary fibrosis":
       if con in range(95,100):
          result = 4
       if con in range(70,95):
          result = 3
       if con in range(50,70):
          result = 2
       if con in range(0,50):
          result = 1
    return result

and then

   df['ipf_category'] = ipf_cat(df['dx1'], df['dxcon1'])

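where df['dx1'] is one column, containing strings, and df['dxcon1'] is another column, containing integers from 0 to 100. The function works fine in plain Python when called with single values, but I keep getting this error: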

 ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I have seen previous answers, such as

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

but I have not been able to apply those solutions to my particular function.

1 answer:

Answer 0 (score: 1)

I would use the pd.cut() method:

Source DF:

In [157]: df
Out[157]:
  con                            var
0  53                            ???
1  97  Idiopathic pulmonary fibrosis
2  75                            ???
3  11  Idiopathic pulmonary fibrosis
4  70                            ???
5  52  Idiopathic pulmonary fibrosis
6  74                            ???
7  25  Idiopathic pulmonary fibrosis
8  92                            ???
9  80  Idiopathic pulmonary fibrosis

Solution:

In [158]: df['ipf_category'] = -999
     ...:
     ...: bins = [-1, 50, 70, 95, 101]
     ...: labels = [1,2,3,4]
     ...:
     ...: df.loc[df['var']=='Idiopathic pulmonary fibrosis', 'ipf_category'] = \
     ...:     pd.cut(df['con'], bins=bins, labels=labels)
     ...:

In [159]: df
Out[159]:
  con                            var  ipf_category
0  53                            ???          -999
1  97  Idiopathic pulmonary fibrosis             4
2  75                            ???          -999
3  11  Idiopathic pulmonary fibrosis             1
4  70                            ???          -999
5  52  Idiopathic pulmonary fibrosis             2
6  74                            ???          -999
7  25  Idiopathic pulmonary fibrosis             1
8  92                            ???          -999
9  80  Idiopathic pulmonary fibrosis             3
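
One caveat about the bin edges: pd.cut() uses right-closed intervals by default, so with bins = [-1, 50, 70, 95, 101] a value of exactly 50 lands in category 1, whereas range(50,70) in the question puts it in category 2 (the same applies to 70 and 95). If the boundaries from the question matter, something along these lines should reproduce them. This is only a sketch: the frame `demo` and its values are made up for illustration, left-closed bins via right=False are my assumption about the intended boundaries, and a value of 100 falls into category 4 here even though the original function does not handle it.

```python
import pandas as pd

# Hypothetical sample data, chosen to sit on the bin boundaries.
demo = pd.DataFrame({
    'con': [0, 49, 50, 69, 70, 94, 95, 99],
    'var': ['Idiopathic pulmonary fibrosis'] * 8,
})

# Left-closed bins: [0, 50), [50, 70), [70, 95), [95, 101)
bins = [0, 50, 70, 95, 101]
labels = [1, 2, 3, 4]

cats = pd.cut(demo['con'], bins=bins, labels=labels, right=False)

# Convert the categorical result to plain integers (this assumes every
# 'con' value falls inside the bins, so there are no missing categories),
# then keep it only for matching rows; other rows get the -999 placeholder.
is_ipf = demo['var'] == 'Idiopathic pulmonary fibrosis'
demo['ipf_category'] = cats.astype(int).where(is_ipf, -999)
```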

Setup:

import numpy as np
import pandas as pd

df = pd.DataFrame({
  'con':np.random.randint(100, size=10),
  'var':np.random.choice(['Idiopathic pulmonary fibrosis','???'], 10)
})
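
As for why the original call fails: ipf_cat(df['dx1'], df['dxcon1']) passes whole columns, so var == "Idiopathic pulmonary fibrosis" produces a boolean Series, and `if` cannot reduce that to a single True/False. If keeping the function is preferred over pd.cut(), one option is to call it once per row. The following is a sketch under a few assumptions: it uses the column names from the question with made-up sample rows, and it returns -999 for rows that match no branch (the original function would raise UnboundLocalError there because `result` is never assigned).

```python
import pandas as pd

def ipf_cat(var, con):
    # Operates on scalars: one diagnosis string and one integer confidence.
    if var != "Idiopathic pulmonary fibrosis":
        return -999  # placeholder for other diagnoses (assumption)
    if 95 <= con < 100:
        return 4
    if 70 <= con < 95:
        return 3
    if 50 <= con < 70:
        return 2
    if 0 <= con < 50:
        return 1
    return -999  # confidence outside 0-99 (assumption)

# Hypothetical sample rows using the column names from the question.
df = pd.DataFrame({
    'dx1': ['Idiopathic pulmonary fibrosis', 'Asthma',
            'Idiopathic pulmonary fibrosis'],
    'dxcon1': [97, 80, 50],
})

# Passing whole columns would put a boolean Series inside `if`, which
# raises the ValueError; iterating row by row passes scalars instead.
df['ipf_category'] = [ipf_cat(v, c) for v, c in zip(df['dx1'], df['dxcon1'])]
```

On large frames the pd.cut() version is likely to be faster, since it works on the whole column at once rather than calling a Python function per row.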