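I want to apply a function made up of classification rules that assigns a High, Medium, or Low risk value to a new column, based on each participant's gender, age, and race.
Suppose this is my df: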
   gender age      race
1    male  11 NON_WHITE
2    male   9     WHITE
3  female  36 NON_WHITE
5  female   3     WHITE
6  female  81     WHITE
7  female  14 NON_WHITE
8  female  14 NON_WHITE
9  female  79 NON_WHITE
10   male  44     WHITE
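I want to assign a value based on gender, age, and race. For example:
High = female; any age; NON_WHITE, or male; >= 70; NON_WHITE
Medium = female; >= 75; WHITE, or male; < 70; NON_WHITE
Low = female; < 75; WHITE, or male; any age; WHITE
The result would be the values assigned to df$class: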
   gender age      race  class
1    male  11 NON_WHITE Medium
2    male   9     WHITE    Low
3  female  36 NON_WHITE   High
5  female   3     WHITE    Low
6  female  81     WHITE Medium
7  female  14 NON_WHITE   High
8  female  14 NON_WHITE   High
9  female  79 NON_WHITE   High
10   male  44     WHITE    Low
I wrote a function and applied it to the data frame:
Riskfun <- function(x) {
  if(x["gender"] == "female" & x["race"] == "NON_WHITE")
    df$class <- "HighRisk"
  if(x["gender"] == "male" & x["age"] >= 70 & x["race"] == "NON_WHITE")
    df$class <- "HighRisk"
  if(x["gender"] == "female" & x["age"] >= 75 & x["race"] == "WHITE")
    df$class <- "MediumRisk"
  if(x["gender"] == "male" & x["age"] < 70 & x["race"] == "NON_WHITE")
    df$class <- "MediumRisk"
  if(x["gender"] == "female" & x["age"] < 75 & x["race"] == "WHITE")
    df$class <- "LowRisk"
  if(x["gender"] == "male" & x["race"] == "WHITE")
    df$class <- "LowRisk"
}
Any ideas or suggestions?
Answer 0 (score: 2)
You can use a for loop to apply the rules row by row (here data is the data frame from the question):
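data$class <- NA_character_  # create the column before filling it row by row
for (i in seq_len(nrow(data))) {
  # High: female NON_WHITE, or male NON_WHITE aged 70 or over
  data$class[i] <- ifelse(data$gender[i] == "female" & data$race[i] == "NON_WHITE" | data$gender[i] == "male" & data$age[i] >= 70 & data$race[i] == "NON_WHITE", "High", data$class[i])
  # Medium: female WHITE aged 75 or over, or male NON_WHITE under 70
  data$class[i] <- ifelse(data$gender[i] == "female" & data$age[i] >= 75 & data$race[i] == "WHITE" | data$gender[i] == "male" & data$age[i] < 70 & data$race[i] == "NON_WHITE", "Medium", data$class[i])
  # Low: female WHITE under 75, or male WHITE of any age
  data$class[i] <- ifelse(data$gender[i] == "female" & data$age[i] < 75 & data$race[i] == "WHITE" | data$gender[i] == "male" & data$race[i] == "WHITE", "Low", data$class[i])
}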
**Or use if statements in the loop instead of ifelse; both versions give the same result:**
data$class <- NA_character_  # create the column before filling it row by row
for (i in seq_len(nrow(data))) {
  if ((data$gender[i] == "female" && data$race[i] == "NON_WHITE") ||
      (data$gender[i] == "male" && data$age[i] >= 70 && data$race[i] == "NON_WHITE")) {
    data$class[i] <- "High"
  }
  if ((data$gender[i] == "female" && data$age[i] >= 75 && data$race[i] == "WHITE") ||
      (data$gender[i] == "male" && data$age[i] < 70 && data$race[i] == "NON_WHITE")) {
    data$class[i] <- "Medium"
  }
  if ((data$gender[i] == "female" && data$age[i] < 75 && data$race[i] == "WHITE") ||
      (data$gender[i] == "male" && data$race[i] == "WHITE")) {
    data$class[i] <- "Low"
  }
}
print(data)
  gender age      race  class
1   male  11 NON_WHITE Medium
2   male   9     WHITE    Low
3 female  36 NON_WHITE   High
4 female   3     WHITE    Low
5 female  81     WHITE Medium
6 female  14 NON_WHITE   High
7 female  14 NON_WHITE   High
8 female  79 NON_WHITE   High
9   male  44     WHITE    Low
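As a possible alternative to the loops above, the same rules can be applied to whole columns at once, since R's comparison operators and ifelse are vectorized. The sketch below rebuilds the example data so that it runs on its own. risk_class is a hypothetical helper name, not something from the original post. It returns the labels instead of assigning to df$class inside the function, which in the question's Riskfun only changes a local copy of df.

```r
# Rebuild the example data from the question (original row names are not kept).
data <- data.frame(
  gender = c("male", "male", "female", "female", "female",
             "female", "female", "female", "male"),
  age    = c(11, 9, 36, 3, 81, 14, 14, 79, 44),
  race   = c("NON_WHITE", "WHITE", "NON_WHITE", "WHITE", "WHITE",
             "NON_WHITE", "NON_WHITE", "NON_WHITE", "WHITE"),
  stringsAsFactors = FALSE
)

# risk_class() is a hypothetical helper: it takes whole columns
# and returns one label per row.
risk_class <- function(gender, age, race) {
  high <- (gender == "female" & race == "NON_WHITE") |
          (gender == "male" & age >= 70 & race == "NON_WHITE")
  medium <- (gender == "female" & age >= 75 & race == "WHITE") |
            (gender == "male" & age < 70 & race == "NON_WHITE")
  low <- (gender == "female" & age < 75 & race == "WHITE") |
         (gender == "male" & race == "WHITE")
  # Rows that match no rule get NA.
  ifelse(high, "High",
         ifelse(medium, "Medium",
                ifelse(low, "Low", NA_character_)))
}

data$class <- risk_class(data$gender, data$age, data$race)
print(data)
```

Because the function returns a vector, it can be reused on any data frame that has the same three columns.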