I want to create a new column whose value is 1 or 0, based on conditions across multiple columns.
I am using the dataset below:
df = pd.DataFrame({
'name' : ['a', 'a', 'a','a',' a','a','a', 's', 's','s','l','a','c','a', 'e','a','g', 'd','c','d','a','f','a','a','a'],
'month' : [5, 12, 3, 12, 3, 6,7,8,9,10,11,12,4,5,2,6,7,8,3, 4, 7, 6,7,8,8],
'score' : [2.5, 5, 3.5, 2.5, 5, 3.5,2,3.5,4,2,1.5,1,1.5,4,5.5,2,3,1,2,3.5,4,2,3.5,3,4]})
I tried the following:
df['passed'] = lambda x: '1' if df[(df['name']=='a') & (df['month'] <8) & (df['score']> 3.5)] else '0'
This is the output I get:
   name  month  score                               passed
0     a      5    2.5  <function <lambda> at 0x1a2050c158>
1     a     12    5.0  <function <lambda> at 0x1a2050c158>
I need a value of 1 or 0, not "<function <lambda> at 0x1a2050c158>".
Answer 0 (score: 2)
Try using np.where:
import numpy as np
df['passed'] = np.where((df['name'] == 'a') & (df['month'] < 8) & (df['score'] > 3.5), 1, 0)
Or cast the boolean mask to integers directly:
df['passed'] = ((df['name'] == 'a') & (df['month'] < 8) & (df['score'] > 3.5)).astype(int)
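The original attempt assigned the lambda object itself to the column, which is why every row showed the function's repr. A boolean mask avoids that. The following is a minimal sketch, assuming the question's dataset, that checks both forms above yield the same flags; the names `mask`, `via_where`, and `via_astype` are illustrative and not from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['a', 'a', 'a', 'a', ' a', 'a', 'a', 's', 's', 's', 'l', 'a', 'c',
             'a', 'e', 'a', 'g', 'd', 'c', 'd', 'a', 'f', 'a', 'a', 'a'],
    'month': [5, 12, 3, 12, 3, 6, 7, 8, 9, 10, 11, 12, 4,
              5, 2, 6, 7, 8, 3, 4, 7, 6, 7, 8, 8],
    'score': [2.5, 5, 3.5, 2.5, 5, 3.5, 2, 3.5, 4, 2, 1.5, 1, 1.5,
              4, 5.5, 2, 3, 1, 2, 3.5, 4, 2, 3.5, 3, 4],
})

# Element-wise comparisons combined with & give one boolean per row.
mask = (df['name'] == 'a') & (df['month'] < 8) & (df['score'] > 3.5)

via_where = np.where(mask, 1, 0)   # NumPy array of 1s and 0s
via_astype = mask.astype(int)      # Series: True becomes 1, False becomes 0

df['passed'] = via_astype
print(df.index[mask].tolist())     # → [13, 20]
```

Only rows 13 and 20 satisfy all three conditions, which matches the output shown in the last answer.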
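Answer 1 (score: 1)
Try a row-wise apply:
# Use row['name'] rather than row.name: inside apply(axis=1), row.name is the row's index label.
df['passed'] = df.apply(lambda row: 1 if (row['name'] == 'a') and (row['month'] < 8) and (row['score'] > 3.5) else 0, axis=1)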
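As a rough check, the row-wise approach can be compared with a boolean mask. This sketch assumes a five-row sample taken from the question's data (rows 13, 1, 4, 7, and 20); the helper name `passed_flag` and the variable `vectorized` are hypothetical.

```python
import pandas as pd

# Five rows copied from the question's dataset (original rows 13, 1, 4, 7, 20).
df = pd.DataFrame({
    'name': ['a', 'a', ' a', 's', 'a'],
    'month': [5, 12, 3, 8, 7],
    'score': [4.0, 5.0, 5.0, 3.5, 4.0],
})

def passed_flag(row):
    # row['name'] reads the column; row.name would be the index label instead.
    if row['name'] == 'a' and row['month'] < 8 and row['score'] > 3.5:
        return 1
    return 0

# axis=1 passes each row to the function as a Series.
df['passed'] = df.apply(passed_flag, axis=1)

# The same flags computed without a Python-level loop over rows.
vectorized = ((df['name'] == 'a') & (df['month'] < 8) & (df['score'] > 3.5)).astype(int)

print(df['passed'].tolist())   # → [1, 0, 0, 0, 1]
```

Because `apply` with `axis=1` calls the Python function once per row, the mask-based form is usually the faster choice on large frames; `apply` is mainly useful when the per-row logic is hard to express with column operations.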
Answer 2 (score: 1)
Build the column with pandas.Series instead:
import pandas as pd
input_dict = {'name' : ['a', 'a', 'a','a',' a','a','a', 's', 's','s','l','a','c','a', 'e','a','g', 'd','c','d','a','f','a','a','a'],
'month' : [5, 12, 3, 12, 3, 6,7,8,9,10,11,12,4,5,2,6,7,8,3, 4, 7, 6,7,8,8],
'score' : [2.5, 5, 3.5, 2.5, 5, 3.5,2,3.5,4,2,1.5,1,1.5,4,5.5,2,3,1,2,3.5,4,2,3.5,3,4]}
df = pd.DataFrame(input_dict)
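# '1' when all three conditions hold for a row, otherwise '0' (note: strings, not integers)
df['passed'] = pd.Series(
    ['1' if x == 'a' and y < 8 and z > 3.5 else '0'
     for x, y, z in zip(df['name'].values, df['month'].values, df['score'].values)]
)
Output: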
    name  month  score  passed
0      a      5    2.5       0
1      a     12    5.0       0
2      a      3    3.5       0
3      a     12    2.5       0
4      a      3    5.0       0
5      a      6    3.5       0
6      a      7    2.0       0
7      s      8    3.5       0
8      s      9    4.0       0
9      s     10    2.0       0
10     l     11    1.5       0
11     a     12    1.0       0
12     c      4    1.5       0
13     a      5    4.0       1
14     e      2    5.5       0
15     a      6    2.0       0
16     g      7    3.0       0
17     d      8    1.0       0
18     c      3    2.0       0
19     d      4    3.5       0
20     a      7    4.0       1
21     f      6    2.0       0
22     a      7    3.5       0
23     a      8    3.0       0
24     a      8    4.0       0
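The list comprehension in this answer produces the strings '1' and '0', which print the same as integers but behave differently in arithmetic. If numeric flags are wanted, as the question's wording suggests, one option is to cast the column afterwards. A minimal sketch, assuming a short illustrative Series named `passed` rather than the full dataset:

```python
import pandas as pd

# Illustrative string flags, shaped like the 'passed' column built above.
passed = pd.Series(['0', '1', '0', '1'])

# astype(int) parses each string and returns an integer Series.
as_int = passed.astype(int)

print(as_int.tolist())   # → [0, 1, 0, 1]
print(as_int.sum())      # → 2
```

With integer flags, aggregations such as `sum()` count the passing rows directly.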