Consider the table a, which contains a person ID, a date (in years), and data.
a = data.table(person = c(1,1,1,2,3,3,3,4,4,5,5,5,5,5), date = c(2010,2011,2012,2010,2010,2011,2012,2010,2011,2010,2011,2012,2013,2014), data = c(9,7,6,4,3,3,5,1,6,5,7,8,4,9))
I want to shift "date" by person, so I do this:
a <- a[order(date)]
a[, date := shift(date, 1L, type = "lag"), by=.(person)]
person date data
1: 1 NA 9
2: 2 NA 4
3: 3 NA 3
4: 4 NA 1
5: 5 NA 5
6: 1 2010 7
7: 3 2010 3
8: 4 2010 6
9: 5 2010 7
10: 1 2011 6
11: 3 2011 5
12: 5 2011 8
13: 5 2012 4
14: 5 2013 9
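This is correct, but then I run the same code again to shift by another year (I want the result to look as if date had been shifted by 2 lags):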
a <- a[order(date)]
a[, date := shift(date, 1L, type = "lag"), by=.(person)]
I would expect date 2013 for person 5, date 2010 for person 4, date 2011 for person 3, and date 2011 for person 1. This is the desired (correct) result:
person date data
1: 5 2010 9
2: 1 2010 7
3: 3 2010 3
4: 5 2011 5
5: 5 2012 7
6: 1 NA 6
7: 3 NA 5
8: 5 NA 8
9: 4 NA 1
10: 5 NA 4
11: 1 NA 9
12: 3 NA 3
13: 4 NA 6
14: 2 NA 4
Instead, running the shift a second time gives this strange output:
person date data
1: 1 2010 6
2: 3 2010 5
3: 5 2010 8
4: 4 2010 1
5: 5 2011 4
6: 1 2011 9
7: 3 2011 3
8: 5 2012 9
9: 5 2013 5
10: 1 NA 7
11: 3 NA 3
12: 4 NA 6
13: 5 NA 7
14: 2 NA 4
It seems to be recycling observations?
Answer 0 (score: 1)
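Remove the second reassignment and the second order call. order(date) puts the NA values at the end. shift just operates on the vector it is given, in row order, and since the NA values are now at the end of each person's rows, the real dates get lagged onto those rows and the trailing NA is dropped, instead of the date values moving back one more year as you expect.

Alternatively, you can use the na.last argument in your order call, i.e. a <- a[order(date, na.last = FALSE)], so that the NA rows stay first.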
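To make the mechanism concrete, here is a minimal sketch on a plain vector rather than the full table. It assumes d holds person 1's dates as they stand after the first shift; the names d, lag_default and lag_na_first are only for illustration.

```r
library(data.table)

# Person 1's dates after the first shift, in row order (illustrative vector)
d <- c(NA, 2010, 2011)

ord_default  <- order(d)                   # NA sorted last (the default)
ord_na_first <- order(d, na.last = FALSE)  # NA sorted first

# Lagging the default ordering: the trailing NA falls off the end,
# so 2010 and 2011 survive and no second lag appears to happen
lag_default <- shift(d[ord_default], 1L, type = "lag")

# Lagging the NA-first ordering gives the second lag one would expect
lag_na_first <- shift(d[ord_na_first], 1L, type = "lag")

print(lag_default)   # NA 2010 2011
print(lag_na_first)  # NA   NA 2010
```

With the default ordering the vector handed to shift is 2010, 2011, NA, which is why the dates appear to be recycled onto different rows.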
library(data.table)
#> Warning: package 'data.table' was built under R version 3.4.4
a = data.table(person = c(1,1,1,2,3,3,3,4,4,5,5,5,5,5), date = c(2010,2011,2012,2010,2010,2011,2012,2010,2011,2010,2011,2012,2013,2014), data = c(9,7,6,4,3,3,5,1,6,5,7,8,4,9))
a <- a[order(date)]
a[, date := shift(date, 1L, type = "lag"), by=.(person)]
a[]
#> person date data
#> 1: 1 NA 9
#> 2: 2 NA 4
#> 3: 3 NA 3
#> 4: 4 NA 1
#> 5: 5 NA 5
#> 6: 1 2010 7
#> 7: 3 2010 3
#> 8: 4 2010 6
#> 9: 5 2010 7
#> 10: 1 2011 6
#> 11: 3 2011 5
#> 12: 5 2011 8
#> 13: 5 2012 4
#> 14: 5 2013 9
# Note I'm not reassigning here, just showing for demonstrative purposes
# Notice NA placement
a[order(date), ]
#> person date data
#> 1: 1 2010 7
#> 2: 3 2010 3
#> 3: 4 2010 6
#> 4: 5 2010 7
#> 5: 1 2011 6
#> 6: 3 2011 5
#> 7: 5 2011 8
#> 8: 5 2012 4
#> 9: 5 2013 9
#> 10: 1 NA 9
#> 11: 2 NA 4
#> 12: 3 NA 3
#> 13: 4 NA 1
#> 14: 5 NA 5
# what you expect to see
a[, date := shift(date, 1L, type = "lag"), by=.(person)]
a[]
#> person date data
#> 1: 1 NA 9
#> 2: 2 NA 4
#> 3: 3 NA 3
#> 4: 4 NA 1
#> 5: 5 NA 5
#> 6: 1 NA 7
#> 7: 3 NA 3
#> 8: 4 NA 6
#> 9: 5 NA 7
#> 10: 1 2010 6
#> 11: 3 2010 5
#> 12: 5 2010 8
#> 13: 5 2011 4
#> 14: 5 2012 9
Created on 2019-04-24 by the reprex package (v0.2.1)
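If the goal is simply a lag of two years per person, it may be easier to skip the repeated shift and ask for two lags in one call. The sketch below is one possible way to do that, not part of the original answer. It assumes a fresh copy of the table, and date_lag2 is a hypothetical column name chosen so that the original date column is kept.

```r
library(data.table)

a <- data.table(
  person = c(1,1,1,2,3,3,3,4,4,5,5,5,5,5),
  date   = c(2010,2011,2012,2010,2010,2011,2012,2010,2011,2010,2011,2012,2013,2014),
  data   = c(9,7,6,4,3,3,5,1,6,5,7,8,4,9)
)

# Sort once (by reference) so each person's rows are in chronological order
setorder(a, person, date)

# Lag by two rows within each person in a single call;
# date_lag2 is a hypothetical column name, the original date is untouched
a[, date_lag2 := shift(date, 2L, type = "lag"), by = person]
a[]
```

Because date itself is never overwritten with NA values, there is no second sort whose NA placement could change the row order.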