Replacing a loop with apply-family functions (or dplyr) using logical functions in R

Asked: 2017-04-13 17:53:56

Tags: r loops for-loop dplyr apply

I created this representative data frame and used a for loop to assign conditional categories.

df <- data.frame(Date=c("08/29/2011", "08/29/2011", "08/30/2011", "08/30/2011", "08/30/2011", "08/29/2012", "08/29/2012", "01/15/2012", "08/29/2012"),
             Time=c("09:45", "10:00", "13:00", "13:30", "10:14", "9:09", "11:23", "17:06", "12:20"),
             Diff = c(0.2,4.3,6.5,15.0, 16.5, 31, 30.2, 21.9, 1.9))

library(dplyr)  # provides %>% and mutate()

df1<- df %>%
  mutate(Accuracy=ifelse(Diff<=3, "Excellent", "TBD"))

for(i in 1:nrow(df1)){
  if(df1$Diff[i]>3&&df1$Diff[i]<=10){
    df1$Accuracy[i]<-"Good"} 
  if(df1$Diff[i]>10&&df1$Diff[i]<=15){
    df1$Accuracy[i]<-"Fair"} 
  if(df1$Diff[i]>15&&df1$Diff[i]<=30){
    df1$Accuracy[i]<-"Poor"}
  if(df1$Diff[i]>30){
    df1$Accuracy[i]<-"Unacceptable"}
}

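My actual dataset is very large, and I have read that for loops are generally not the most efficient way to code in R. I believe I can do the same thing by creating a logical vector for each condition, where TRUE marks the rows that satisfy it. I could then assign values by subsetting, for example df1$Accuracy[Good] <- "Good". However, I cannot figure out how to create the logical vectors using apply-family functions or dplyr functions. (Any solution that avoids the for loop is also welcome.) If a for loop is actually the better approach, it would help to know that too.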
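The subsetting idea described above can be sketched in base R without any apply call, because the comparison operators are already vectorised: applied to a whole column they return one TRUE/FALSE per row. This is only a minimal sketch using the `Diff` values from the example data; the names `diff_vals`, `accuracy`, and `good` are illustrative and not part of the original code.

```r
# Diff values copied from the example data frame
diff_vals <- c(0.2, 4.3, 6.5, 15.0, 16.5, 31, 30.2, 21.9, 1.9)

# Start with a default label for every row
accuracy <- rep("TBD", length(diff_vals))

# Element-wise comparison: use a single & here; && is meant for single values
good <- diff_vals > 3 & diff_vals <= 10

# Assign labels by logical subsetting
accuracy[diff_vals <= 3] <- "Excellent"
accuracy[good] <- "Good"
accuracy[diff_vals > 10 & diff_vals <= 15] <- "Fair"
accuracy[diff_vals > 15 & diff_vals <= 30] <- "Poor"
accuracy[diff_vals > 30] <- "Unacceptable"
```

The same assignments work on a data frame column, for example `df1$Accuracy[good] <- "Good"`, provided the logical vector has one element per row.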

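Here are my failed attempts. They return either incorrect NAs or incorrect logical vectors. One of the many things I don't understand is how lapply knows whether to go by column or by row.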

Good<-apply(df1, 1, function(x) ifelse(df1$Diff[x]>3&& df1$Diff[x]<=10, TRUE, FALSE)) #logical, TRUE where condition is true 
Good<-unlist(lapply(df1$Diff,  function(x) {(ifelse(df1$Diff[x]>3&& df1$Diff[x]<=10, TRUE, FALSE))}))
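A likely reason both attempts misbehave is that the function argument `x` is not a row index. `lapply` and `sapply` iterate over the elements of whatever they are given: over the values of a vector such as `df1$Diff`, or over the columns of a data frame (which is a list of columns). `apply(df1, 1, ...)` instead passes each row, after converting the data frame to a matrix, so with mixed column types `x` becomes a character vector. In either case `df1$Diff[x]` indexes by a value rather than a position. A minimal sketch of what the attempts appear to be aiming for, with `good_vapply` and `good_vec` as illustrative names:

```r
# Only the Diff column is needed for this illustration
df1 <- data.frame(Diff = c(0.2, 4.3, 6.5, 15.0, 16.5, 31, 30.2, 21.9, 1.9))

# vapply hands each ELEMENT of the vector to the function, so x is
# already the Diff value and needs no further indexing; && is fine
# here because x is a single number
good_vapply <- vapply(df1$Diff, function(x) x > 3 && x <= 10, logical(1))

# The comparison operators are vectorised, so the apply call is optional
good_vec <- df1$Diff > 3 & df1$Diff <= 10

# Both approaches give the same logical vector
same_result <- identical(good_vapply, good_vec)
```

On a large dataset the vectorised form is usually the simpler choice, since it avoids calling an R function once per element.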

Update: nested ifelse statements work, but any suggestions on how to use apply are still welcome.

df1 <- df %>%
  mutate(Accuracy=ifelse(Diff<=3, "Excellent",
                         ifelse(Diff>3&Diff<=10, "Good",
                                ifelse(Diff>10&Diff<=15, "Fair",
                                       ifelse(Diff>15&Diff<=30, "Poor",
                                              ifelse(Diff>30, "Unpublishable", "TBD"))))))

1 answer:

Answer 0 (score: 2)

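You can use case_when from dplyr. The conditions are evaluated in order and the first one that matches wins, so each line only needs its upper bound: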

# Columns can be referenced by name inside mutate();
# older dplyr releases needed .$Diff here
df1<- df %>%
mutate(Accuracy= case_when(
  Diff <=  3 ~ "Excellent",
  Diff <=  10  ~ "Good",
  Diff <=  15  ~ "Fair",
  Diff <=  30  ~ "Poor",
  Diff >   30  ~ "Unpublishable",
  TRUE  ~"TBD")
)

 df1
        Date  Time Diff      Accuracy
1 08/29/2011 09:45  0.2     Excellent
2 08/29/2011 10:00  4.3          Good
3 08/30/2011 13:00  6.5          Good
4 08/30/2011 13:30 15.0          Fair
5 08/30/2011 10:14 16.5          Poor
6 08/29/2012  9:09 31.0 Unpublishable
7 08/29/2012 11:23 30.2 Unpublishable
8 01/15/2012 17:06 21.9          Poor
9 08/29/2012 12:20  1.9     Excellent
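
Since the categories are contiguous numeric ranges, base R's `cut()` is another option that avoids both the loop and the chained conditions. The following is a sketch rather than part of the original answer; it assumes the same break points and labels as the `case_when` call and keeps only the `Diff` column for brevity.

```r
# Only the Diff column is needed for this illustration
df <- data.frame(Diff = c(0.2, 4.3, 6.5, 15.0, 16.5, 31, 30.2, 21.9, 1.9))

# Intervals are right-closed by default:
# (-Inf,3], (3,10], (10,15], (15,30], (30,Inf]
df$Accuracy <- cut(
  df$Diff,
  breaks = c(-Inf, 3, 10, 15, 30, Inf),
  labels = c("Excellent", "Good", "Fair", "Poor", "Unpublishable")
)

# cut() returns a factor; convert if a character column is preferred
df$Accuracy <- as.character(df$Accuracy)
```

With `cut()` a missing `Diff` value yields NA rather than a "TBD" label, so the fallback category would have to be filled in separately if it is needed.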