R: Insert NAs in one column when the value in another column changes

Time: 2017-01-24 15:17:58

Tags: r data.table na

I have a data.table like the following:

library(data.table)
set.seed(1234)
DT <- data.table(x = c("a","a","a","b","b","c","c","c","d","d","d","d"),
                 v = sample(1:4, 12, replace = TRUE))

  x v
  a 1
  a 3
  a 3
  b 3
  b 4
  c 3
  c 1
  c 1
  d 3
  d 3
  d 3
  d 3

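What I need to do is replace the value of "v" with NA in the first row of each new run of "x", that is, every time "x" changes, as shown below: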

      x v
      a 1
      a 3
      a 3
      b NA
      b 4
      c NA
      c 1
      c 1
      d NA
      d 3
      d 3
      d 3

Do I have to write a loop, or is there a one-liner that does the same thing? Thanks!

1 Answer:

Answer 0 (score: 3)

Yes, there is a one-liner:

DT[x != shift(x), v := NA]


    x  v
 1: a  1
 2: a  3
 3: a  3
 4: b NA
 5: b  4
 6: c NA
 7: c  1
 8: c  1
 9: d NA
10: d  3
11: d  3
12: d  3

For details on this syntax, see ?shift and the data.table vignettes.
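To see why the first row is left alone, it may help to look at the intermediate values. The following is only an illustrative sketch: the table `ex` and the names `prev` and `changed` are made up for this example and are not part of the question's data.

```r
library(data.table)

# Hypothetical small table, used only for illustration
ex <- data.table(x = c("a", "a", "b", "b", "c"),
                 v = c(1L, 3L, 3L, 4L, 3L))

# shift() lags x by one row by default, so the first element is NA
prev <- shift(ex$x)
print(prev)

# Comparing x with its lagged copy flags the first row of each new run;
# the comparison for row 1 is NA because there is no previous row
changed <- ex$x != prev
print(changed)

# In i, data.table treats NA as FALSE, so row 1 is not modified
ex[x != shift(x), v := NA]
print(ex)
```

Only rows 3 and 5 of `ex` end up with `v` set to NA, since those are the rows where `x` differs from the row above.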

Alternatively, to avoid computing shift and the full != comparison...

DT[DT[, if (.GRP > 1L) .I[1L], by=rleid(x)]$V1, v := NA ]

This follows @eddi's approach to subsetting by group. For details, see ?.GRP, ?.I and ?rleid.
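The second one-liner can be unpacked step by step. This is a sketch under the same assumptions as any toy example: the table `ex` and the name `idx` are hypothetical and exist only to show the intermediate results.

```r
library(data.table)

# Hypothetical small table, used only for illustration
ex <- data.table(x = c("a", "a", "b", "b", "c"),
                 v = c(1L, 3L, 3L, 4L, 3L))

# rleid() assigns one id per run of consecutive equal values
print(rleid(ex$x))

# Grouping by run: .GRP is the group counter and .I holds the row numbers
# of the current group. For every run after the first, keep the row number
# of its first row; the first run returns NULL and is dropped.
idx <- ex[, if (.GRP > 1L) .I[1L], by = rleid(x)]$V1
print(idx)

# Use those row numbers directly in i to set v to NA
ex[idx, v := NA]
print(ex)
```

Here `idx` contains the row numbers 3 and 5, so the update touches only those rows instead of evaluating a comparison over the whole column.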