mutate() and ifelse() fail because of NAs in a column

Time: 2016-09-08 03:59:44

Tags: r dplyr

I ran into a problem while trying to create a new column with ifelse. A similar question is "dplyr error: strange issue when combining group_by, mutate and ifelse. Is it a bug?"

library(dplyr)

set.seed(101)
time <- sort(runif(10, 0, 10))
group <- rep(c(1, 2), each = 5)
az <- c(sort(runif(5, -1, 1), decreasing = TRUE), sort(runif(5, -1, 0.2), decreasing = TRUE))

df <- data.frame(time,az,group)

#       time          az group
#1  0.4382482  0.86326886     1
#2  2.4985572  0.75959146     1
#3  3.0005483  0.46394519     1
#4  3.3346714  0.41374948     1
#5  3.7219838 -0.08975881     1
#6  5.4582855 -0.01547669     2
#7  5.8486663 -0.29161632     2
#8  6.2201196 -0.50599980     2
#9  6.5769040 -0.73105782     2
#10 7.0968402 -0.95366733     2

I am trying to conditionally mutate the clas column of df. However, because sw_time contains NAs, every value of clas becomes NA, including group 1, where it should be nrm.

df1 <- df %>%
  group_by(group) %>%
  mutate(sw_time = abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]])) %>%
  mutate(clas = as.numeric(ifelse(sw_time < 3, "nrm", "abn")))

Source: local data frame [10 x 5]
Groups: group [2]

        time          az group  sw_time  clas
       (dbl)       (dbl) (dbl)    (dbl) (dbl)
1  0.4382482  0.86326886     1 2.060309    NA
2  2.4985572  0.75959146     1 2.060309    NA
3  3.0005483  0.46394519     1 2.060309    NA
4  3.3346714  0.41374948     1 2.060309    NA
5  3.7219838 -0.08975881     1 2.060309    NA
6  5.4582855 -0.01547669     2       NA    NA
7  5.8486663 -0.29161632     2       NA    NA
8  6.2201196 -0.50599980     2       NA    NA
9  6.5769040 -0.73105782     2       NA    NA
10 7.0968402 -0.95366733     2       NA    NA

Thanks in advance for your help!

1 answer:

Answer 0: (score: 2)

Coercing values of class character to numeric produces NA. Instead, we may need a factor coerced to numeric (here with as.integer):

df %>%
    group_by(group)%>%
     mutate(sw_time=abs(time[which(az<=0.8)[1]]-time[which(az>0)[1]]),
            clas=as.integer(factor(ifelse(sw_time<3,"nrm","abn"))))
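
A minimal base-R sketch of why the original as.numeric() call returns NA and what the factor route returns instead. The variable names are illustrative and not from the original post. One caveat, offered as an observation rather than as part of the answer: inside a grouped mutate() the factor is built separately for each group, so the integer codes depend on which labels occur in that group unless levels is set explicitly.

```r
labels <- c("nrm", "abn", NA)

# Text that does not look like a number cannot be parsed, so every element
# becomes NA (R also emits a coercion warning, silenced here).
direct <- suppressWarnings(as.numeric(labels))

# Going through a factor returns the integer level codes instead.
# Levels are sorted alphabetically by default: "abn" = 1, "nrm" = 2.
via_factor <- as.integer(factor(labels))

# When only one label is present, that label gets code 1 ...
one_label <- as.integer(factor(c("nrm", "nrm")))

# ... unless the levels are fixed explicitly.
fixed <- as.integer(factor(c("nrm", "nrm"), levels = c("abn", "nrm")))
```

If stable codes across groups matter, passing levels = c("abn", "nrm") to factor() is one way to get them.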

If we are only interested in 'nrm' and 'abn', just remove the as.integer(factor( wrapper:

df%>%
  group_by(group)%>%
  mutate(sw_time=abs(time[which(az<=0.8)[1]]-time[which(az>0)[1]]),
          clas=ifelse(sw_time<3,"nrm","abn"))
#        time          az group  sw_time  clas
#       <dbl>       <dbl> <dbl>    <dbl> <chr>
#1  0.4382482  0.86326886     1 2.060309   nrm
#2  2.4985572  0.75959146     1 2.060309   nrm
#3  3.0005483  0.46394519     1 2.060309   nrm
#4  3.3346714  0.41374948     1 2.060309   nrm
#5  3.7219838 -0.08975881     1 2.060309   nrm
#6  5.4582855 -0.01547669     2       NA  <NA>
#7  5.8486663 -0.29161632     2       NA  <NA>
#8  6.2201196 -0.50599980     2       NA  <NA>
#9  6.5769040 -0.73105782     2       NA  <NA>
#10 7.0968402 -0.95366733     2       NA  <NA>
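
The NA in sw_time for group 2 can be traced step by step. The sketch below copies the group 2 values from the printed data frame and repeats the expression from the question outside dplyr; the intermediate names first_pos and first_low are introduced here for illustration only.

```r
# Group 2 rows, copied from the printed data frame
time <- c(5.4582855, 5.8486663, 6.2201196, 6.5769040, 7.0968402)
az   <- c(-0.01547669, -0.29161632, -0.50599980, -0.73105782, -0.95366733)

# No az value in this group is positive, so which() returns integer(0),
# and taking element [1] of an empty vector gives NA.
first_pos <- which(az > 0)[1]

# Every az value is <= 0.8, so the first match is position 1.
first_low <- which(az <= 0.8)[1]

# Indexing time with NA yields NA, and the NA propagates through the
# subtraction, abs(), the comparison and ifelse().
sw_time <- abs(time[first_low] - time[first_pos])
clas <- ifelse(sw_time < 3, "nrm", "abn")
```

So the NA rows for group 2 reflect the data (no positive az in that group) rather than a failure of ifelse() itself.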

We can also use data.table:

library(data.table)
setDT(df)[, c("sw_time", "clas") := {
           v1 <- abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]])
          .(v1 , c("abn", "nrm")[(v1 < 3) + 1]) },
                      by = group]
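
The expression c("abn", "nrm")[(v1 < 3) + 1] replaces ifelse() with vector indexing. A small base-R sketch of the idea, using made-up v1 values (only 2.060309 appears in the original data):

```r
# Illustrative values: one below the threshold, one above, one missing
v1 <- c(2.060309, 5, NA)

# A logical comparison gives TRUE/FALSE/NA; adding 1 turns that into
# the positions 2, 1 and NA.
idx <- (v1 < 3) + 1

# Position 1 picks "abn", position 2 picks "nrm", and an NA index
# returns NA.
clas <- c("abn", "nrm")[idx]
```

Missing values therefore stay missing, just as they do with ifelse().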

If the final output does not need to involve 'nrm' and 'abn', we do not need the ifelse part. We can use as.integer(sw_time < 3) directly.
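
A short sketch of that last suggestion, again with illustrative sw_time values (4.5 is made up for the example):

```r
sw_time <- c(2.060309, 4.5, NA)

# TRUE becomes 1, FALSE becomes 0, and NA stays NA
flag <- as.integer(sw_time < 3)
```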