hd.loc[(hd['ap_hi'] <= 120) & (hd['ap_lo'] < 80), ['ap_hi','ap_lo']] = 'normal'
hd.loc[(hd['ap_hi'] > 120) & (hd['ap_hi'] <= 129) & (hd['ap_lo'] < 80), ['ap_hi','ap_lo']] = 'elevated'
hd.loc[(hd['ap_hi'] > 130) & (hd['ap_hi'] <= 139) | (hd['ap_lo'] >= 80) & (hd['ap_lo'] < 89), ['ap_hi','ap_lo']] = 'high blood pressure 1'
hd.loc[(hd['ap_hi'] > 140) & (hd['ap_hi'] <= 179) | (hd['ap_lo'] > 90) & (hd['ap_lo'] <119 ), ['ap_hi','ap_lo']] = 'high blood pressure 2'
hd.loc[(hd['ap_hi'] > 180) | (hd['ap_lo'] > 120) , ['ap_hi','ap_lo']] = 'hypertensive crisis'
When I run this code, the second line raises the error '>' not supported between instances of 'str' and 'int'. I don't know what is causing the error. Thanks in advance.
Answer 0 (score: 1)
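You are putting string values into columns that contain integers. After the first line runs, the matched rows of 'ap_hi' and 'ap_lo' hold the string 'normal', so the comparison on the second line ends up comparing a str with an int and fails. Instead, create a new column for the strings. Here I create a new column, 'bp_level':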
hd.loc[(hd['ap_hi'] <= 120) & (hd['ap_lo'] < 80), 'bp_level'] = 'normal'
hd.loc[(hd['ap_hi'] > 120) & (hd['ap_hi'] <= 129) & (hd['ap_lo'] < 80), 'bp_level'] = 'elevated'
hd.loc[(hd['ap_hi'] > 130) & (hd['ap_hi'] <= 139) | (hd['ap_lo'] >= 80) & (hd['ap_lo'] < 89), 'bp_level'] = 'high blood pressure 1'
hd.loc[(hd['ap_hi'] > 140) & (hd['ap_hi'] <= 179) | (hd['ap_lo'] > 90) & (hd['ap_lo'] <119 ), 'bp_level'] = 'high blood pressure 2'
hd.loc[(hd['ap_hi'] > 180) | (hd['ap_lo'] > 120) , 'bp_level'] = 'hypertensive crisis'
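To see why a separate column avoids the error, here is a minimal sketch of what the original columns look like after the first assignment. The Series `mixed` is a made-up stand-in, not data from the question: an object column holding both integers and a label.

```python
import pandas as pd

# Hypothetical stand-in for 'ap_hi' after one row was overwritten with a label.
mixed = pd.Series([110, "normal", 125], dtype=object)

try:
    # The comparison is evaluated element by element; 'normal' > 120 is not defined.
    mixed > 120
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Because 'bp_level' is a different column, 'ap_hi' and 'ap_lo' stay numeric and every later comparison still sees integers.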
If you want to overwrite those columns, do so after all the comparisons are complete:
hd.loc[:,['ap_hi', 'ap_lo']] = hd['bp_level']
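As an aside, the conditions as written leave some values unlabelled: for example, a row with ap_hi equal to 130 and ap_lo below 80 matches none of them, so 'bp_level' stays NaN there. The sketch below is one possible way to close those gaps with `numpy.select`, which picks the first matching condition for each row. The sample data and the choice of `>=` at each boundary are my assumptions about the intended ranges, not something stated in the question.

```python
import numpy as np
import pandas as pd

# Made-up sample rows, including boundary values such as ap_hi == 130.
hd = pd.DataFrame({
    "ap_hi": [110, 125, 130, 145, 185, 118],
    "ap_lo": [70, 75, 79, 85, 95, 125],
})

# Ordered from most to least severe: np.select takes the first condition that is True.
conditions = [
    (hd["ap_hi"] >= 180) | (hd["ap_lo"] >= 120),
    (hd["ap_hi"] >= 140) | (hd["ap_lo"] >= 90),
    (hd["ap_hi"] >= 130) | (hd["ap_lo"] >= 80),
    (hd["ap_hi"] > 120),
]
labels = [
    "hypertensive crisis",
    "high blood pressure 2",
    "high blood pressure 1",
    "elevated",
]

# Rows matching no condition (ap_hi <= 120 and ap_lo < 80) fall through to 'normal'.
hd["bp_level"] = np.select(conditions, labels, default="normal")
print(hd)
```

Since every comparison is evaluated before anything is assigned, the numeric columns are never touched while the labels are being computed.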
Here is a simpler, working example (tested with Python 3.8 and pandas 1.0.5):
import pandas as pd
df = pd.DataFrame({'A':range(10)})
df.loc[(df['A'] < 3), 'B'] = '<3'
df.loc[(df['A'] < 6) & (df['A'] >= 3), 'B'] = '3 to <6'
df.loc[(df['A'] >= 6), 'B'] = '6+'
print(df)
Output:
   A        B
0  0       <3
1  1       <3
2  2       <3
3  3  3 to <6
4  4  3 to <6
5  5  3 to <6
6  6       6+
7  7       6+
8  8       6+
9  9       6+
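For simple one-column ranges like this, `pandas.cut` may be a more compact alternative to a chain of `.loc` assignments. This is only a sketch: the bin edges and labels are chosen to mirror the example above, and `right=False` makes each bin include its left edge and exclude its right one.

```python
import pandas as pd

df = pd.DataFrame({"A": range(10)})

# Bins are [-inf, 3), [3, 6), [6, inf) because right=False closes the left edge.
bins = [float("-inf"), 3, 6, float("inf")]
labels = ["<3", "3 to <6", "6+"]

# The result is a categorical column; 'A' itself stays numeric.
df["B"] = pd.cut(df["A"], bins=bins, labels=labels, right=False)
print(df)
```

Note that this approach only covers binning on a single column; conditions that combine two columns, such as 'ap_hi' and 'ap_lo', still need boolean masks.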