I am trying to create a new column by applying multiple conditions to a column whose dtype is float.
Sample data:
ID CO
0 12.0
1 11.0
2 8.0
3 6.5
4 5.5
5 5.7
6 5.8
7 6.5
8 6.8
for index, row in df.iterrows():
    if row['CO'] in arange(0,1.54):
        row.loc['CO_1'] = 'GOOD'
    elif row['CO'] in arange(1.54,1.70):
        row.loc['CO_1'] = 'MOD'
The above didn't work, so I tried writing a separate function:
def aqi_CO(row):
    val_1=0
    for x in row:
        if x in arange(0,0.054):
            val_1 = 'GOOD'
        elif x in arange(0.054,0.070):
            val_1 = 'MODERATE'
        elif x in arange(0.070,0.085):
            val_1 = 'UNHEALTHY_SG'
        elif x in arange(0.085,0.105):
            val_1 = 'UNHEALTHY'
        elif x in arange(0.105,0.200):
            val_1 = 'VERY_UNHEALTHY'
        elif x in arange(0.200,3):
            val_1 = 'HAZARDOUS'
    return val_1
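and called it through apply:
df['aqi_CO'] = df.apply(lambda x: aqi_CO(df['CO']), axis=1)
That did not work either. I am confused now. Can someone help me add a new column by iterating over the DataFrame row by row and checking 3 or 4 conditions?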
Answer 0 (score: 1)
Use pd.cut:
pd.cut(df.CO,bins=[0,2,4,6,8,9,100],labels=["GOOD","MODERATE","UNHEALTHY_SG","UNHEALTHY","VERY_UNHEALTHY","HAZARDOUS"])
Out[866]:
0 HAZARDOUS
1 HAZARDOUS
2 UNHEALTHY
3 UNHEALTHY
4 UNHEALTHY_SG
5 UNHEALTHY_SG
6 UNHEALTHY_SG
7 UNHEALTHY
8 UNHEALTHY
Name: CO, dtype: category
df['new']=pd.cut(df.CO,bins=[0,2,4,6,8,9,100],labels=["GOOD","MODERATE","UNHEALTHY_SG","UNHEALTHY","VERY_UNHEALTHY","HAZARDOUS"])
df
Out[868]:
ID CO new
0 0 12.0 HAZARDOUS
1 1 11.0 HAZARDOUS
2 2 8.0 UNHEALTHY
3 3 6.5 UNHEALTHY
4 4 5.5 UNHEALTHY_SG
5 5 5.7 UNHEALTHY_SG
6 6 5.8 UNHEALTHY_SG
7 7 6.5 UNHEALTHY
8 8 6.8 UNHEALTHY
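As a variation on the above, here is a minimal sketch that uses the breakpoints from the question's aqi_CO function rather than the illustrative bins. It assumes each interval is closed on the left (right=False), and the sample values are made up so that one lands in each bin. Values outside the bins would come out as NaN.

```python
import pandas as pd

# Hypothetical sample values, chosen so that each one falls in a different bin.
df = pd.DataFrame({"CO": [0.010, 0.060, 0.075, 0.090, 0.150, 1.000]})

# Breakpoints copied from the question's aqi_CO function.
bins = [0, 0.054, 0.070, 0.085, 0.105, 0.200, 3]
labels = ["GOOD", "MODERATE", "UNHEALTHY_SG", "UNHEALTHY",
          "VERY_UNHEALTHY", "HAZARDOUS"]

# right=False makes every bin [low, high), so 0.054 itself counts as MODERATE.
df["aqi_CO"] = pd.cut(df["CO"], bins=bins, labels=labels, right=False)
print(df)
```

Whether the upper or lower edge should be inclusive depends on how the AQI table you are following defines its ranges, so treat right=False as an assumption to check.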
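Answer 1 (score: 0)
In your first piece of code, arange(0,1.54) returns array([ 0., 1.]), and none of the values in the sample data are in it. If you still want to check membership that way, you can widen the range and add a step size, for example arange(0, 7, 0.1). For the next step in the for loop, use .loc on the dataframe with index rather than on row, because the row yielded by iterrows is a copy and assigning to it does not change df. That is, write df.loc[index,'CO_1'] = 'GOOD' instead of row.loc['CO_1'] = 'GOOD':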
for index, row in df.iterrows():
    if row['CO'] in arange(0, 7, 0.1):
        df.loc[index,'CO_1'] = 'GOOD'
    elif row['CO'] in arange(1.54,1.70):
        df.loc[index,'CO_1'] = 'MOD'
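Result:
CO CO_1
ID
0 12.0 NaN
1 11.0 NaN
2 8.0 NaN
3 6.5 GOOD
4 5.5 GOOD
5 5.7 GOOD
6 5.8 NaN
7 6.5 GOOD
8 6.8 NaN
Similarly, for the second snippet, you can use a lambda and apply it to the column only:
df['aqi_CO'] = df['CO'].apply(lambda x: aqi_CO(x))
Now, since only a single column value is passed in, the function can check it without iterating (note: the range in the function's first condition has been changed so that some output is visible):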
def aqi_CO(x):
    val_1=0
    if x in arange(0,7, 0.1):
        val_1 = 'GOOD'
    elif x in arange(0.054,0.070):
        val_1 = 'MODERATE'
    elif x in arange(0.070,0.085):
        val_1 = 'UNHEALTHY_SG'
    elif x in arange(0.085,0.105):
        val_1 = 'UNHEALTHY'
    elif x in arange(0.105,0.200):
        val_1 = 'VERY_UNHEALTHY'
    elif x in arange(0.200,3):
        val_1 = 'HAZARDOUS'
    return val_1
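One caveat with this approach: x in arange(...) tests for exact equality against the generated floating-point grid points, not for membership in an interval, which is why 5.8 and 6.8 show up as NaN in the iterrows result above. A sketch of an alternative that uses plain interval comparisons is shown below. The function name aqi_CO_cmp and the sample values are hypothetical, the thresholds are copied from the question, and the half-open intervals [low, high) are an assumption.

```python
import pandas as pd

def aqi_CO_cmp(x):
    """Map a single CO reading to a label using half-open intervals [low, high)."""
    if 0 <= x < 0.054:
        return "GOOD"
    elif 0.054 <= x < 0.070:
        return "MODERATE"
    elif 0.070 <= x < 0.085:
        return "UNHEALTHY_SG"
    elif 0.085 <= x < 0.105:
        return "UNHEALTHY"
    elif 0.105 <= x < 0.200:
        return "VERY_UNHEALTHY"
    elif 0.200 <= x < 3:
        return "HAZARDOUS"
    return None  # reading falls outside every range

# Hypothetical readings, including boundary values and one out-of-range value.
df = pd.DataFrame({"CO": [0.010, 0.054, 0.0849, 0.2, 5.0]})
df["aqi_CO"] = df["CO"].apply(aqi_CO_cmp)
print(df)
```

Because the comparisons cover whole intervals, any value between two thresholds gets a label, regardless of how many decimals it has.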