Using R

Time: 2016-05-23 11:48:28

Tags: r function if-statement apply

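I want to apply a function made up of classification rules that assigns a high, medium, or low risk value to a new column, based on each participant's gender, age, and race.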

Suppose this is my df:

   gender age      race
1    male  11 NON_WHITE
2    male   9     WHITE
3  female  36 NON_WHITE
5  female   3     WHITE
6  female  81     WHITE
7  female  14 NON_WHITE
8  female  14 NON_WHITE
9  female  79 NON_WHITE
10   male  44     WHITE
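
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame above. The row names (note that 4 is missing) are copied from the printed output, and storing the columns as plain character vectors rather than factors is an assumption.

```r
# Rebuild the example data; row names follow the printed output (no row 4).
df <- data.frame(
  gender = c("male", "male", "female", "female", "female",
             "female", "female", "female", "male"),
  age    = c(11, 9, 36, 3, 81, 14, 14, 79, 44),
  race   = c("NON_WHITE", "WHITE", "NON_WHITE", "WHITE", "WHITE",
             "NON_WHITE", "NON_WHITE", "NON_WHITE", "WHITE"),
  row.names = as.character(c(1:3, 5:10)),
  stringsAsFactors = FALSE  # assumption: keep strings as character, not factor
)
```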

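I want to assign a value based on gender, age, and race. For example:

High = female; any age; NON_WHITE, or male; >= 70; NON_WHITE

Medium = female; >= 75; WHITE, or male; < 70; NON_WHITE

Low = female; < 75; WHITE, or male; any age; WHITE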

The result would be the values assigned to df$class:

   gender age      race   class
1    male  11 NON_WHITE  Medium
2    male   9     WHITE     Low
3  female  36 NON_WHITE    High
5  female   3     WHITE     Low
6  female  81     WHITE  Medium
7  female  14 NON_WHITE    High
8  female  14 NON_WHITE    High
9  female  79 NON_WHITE    High
10   male  44     WHITE     Low

I wrote a function and applied it to the data frame:

Riskfun <- function(x) { 
if(x["gender"] == "female" & x["race"] == "NON_WHITE") 
    df$class <- "HighRisk"
if(x["gender"] == "male" & x["age"] >= 70 & x["race"] == "NON_WHITE") 
    df$class <- "HighRisk"
if(x["gender"] == "female" & x["age"] >= 75 & x["race"] == "WHITE") 
    df$class <- "MediumRisk"
if(x["gender"] == "male" & x["age"] < 70 & x["race"] == "NON_WHITE") 
    df$class <- "MediumRisk"
if(x["gender"] == "female" & x["age"] < 75 & x["race"] == "WHITE") 
    df$class <- "LowRisk"
if(x["gender"] == "male" & x["race"] == "WHITE") 
    df$class <- "LowRisk"
 }

Any ideas or suggestions?

1 answer:

Answer 0 (score: 2)

You can use a for loop that works through the rows element-wise:

for (i in seq_len(nrow(data))) {
  # Start with "High" or "Low"; the next two steps overwrite the label where needed
  data$class[i] <- ifelse(data$gender[i] == "female" & data$race[i] == "NON_WHITE" |
                            data$gender[i] == "male" & data$age[i] >= 70, "High", "Low")
  data$class[i] <- ifelse(data$gender[i] == "female" & data$age[i] >= 75 & data$race[i] == "WHITE" |
                            data$gender[i] == "male" & data$age[i] < 70 & data$race[i] == "NON_WHITE", "Medium", data$class[i])
  data$class[i] <- ifelse(data$gender[i] == "female" & data$age[i] < 75 & data$race[i] == "WHITE" |
                            data$gender[i] == "male" & data$race[i] == "WHITE", "Low", data$class[i])
}

Or use an if statement in the loop rather than ifelse; both versions give the same result:

for (i in seq_len(nrow(data))) {
  if (data$gender[i] == "female" && data$race[i] == "NON_WHITE" ||
      data$gender[i] == "male" && data$age[i] >= 70) {
    data$class[i] <- "High"
  }
  if (data$gender[i] == "female" && data$age[i] >= 75 && data$race[i] == "WHITE" ||
      data$gender[i] == "male" && data$age[i] < 70 && data$race[i] == "NON_WHITE") {
    data$class[i] <- "Medium"
  }
  if (data$gender[i] == "female" && data$age[i] < 75 && data$race[i] == "WHITE" ||
      data$gender[i] == "male" && data$race[i] == "WHITE") {
    data$class[i] <- "Low"
  }
}

print(data)
  gender age      race  class
1   male  11 NON_WHITE Medium
2   male   9     WHITE    Low
3 female  36 NON_WHITE   High
4 female   3     WHITE    Low
5 female  81     WHITE Medium
6 female  14 NON_WHITE   High
7 female  14 NON_WHITE   High
8 female  79 NON_WHITE   High
9   male  44     WHITE    Low
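
A vectorized alternative, offered as a sketch rather than as part of the original answer: R's comparison operators and `ifelse` work on whole columns, so the loop is not strictly needed. The function name `classify_risk` is made up for illustration, and the example data is rebuilt from the question with default row names. It assumes every row matches one of the stated rules; anything that matches neither the High nor the Medium rule is labeled "Low".

```r
# Hypothetical helper: takes whole columns and returns one label per row.
classify_risk <- function(gender, age, race) {
  high <- (gender == "female" & race == "NON_WHITE") |
    (gender == "male" & age >= 70 & race == "NON_WHITE")
  medium <- (gender == "female" & age >= 75 & race == "WHITE") |
    (gender == "male" & age < 70 & race == "NON_WHITE")
  # Assumption: rows matching neither rule (female < 75 WHITE, male WHITE) are "Low".
  ifelse(high, "High", ifelse(medium, "Medium", "Low"))
}

# Example data copied from the question.
df <- data.frame(
  gender = c("male", "male", "female", "female", "female",
             "female", "female", "female", "male"),
  age    = c(11, 9, 36, 3, 81, 14, 14, 79, 44),
  race   = c("NON_WHITE", "WHITE", "NON_WHITE", "WHITE", "WHITE",
             "NON_WHITE", "NON_WHITE", "NON_WHITE", "WHITE"),
  stringsAsFactors = FALSE
)

# The function returns the labels; the column is assigned outside the function.
df$class <- classify_risk(df$gender, df$age, df$race)
print(df)
```

This also hints at why the function in the question has no visible effect: assigning to `df$class` inside a function only changes a local copy of `df`, and it sets the whole column at once rather than a single row. Returning the labels and assigning them outside the function avoids both problems.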