I have the following data table:
library(data.table)
dt <- data.table("one" =c(100,200,300,400,500,600))
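# curr is given one value per row to match the printed table; current data.table versions do not recycle a length-4 vector into 6 rows
dt <- dt[,two:=round(one*1.05,0)][,three:=round(two*1.03,0)][,four:=round(three*1.07,0)][,five:=round(four*1.05,0)][,six:=round(five*1.1,0)][,curr:=c(3,4,5,6,3,4)]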
> dt
   one two three four five six curr
1: 100 105   108  116  122 134    3
2: 200 210   216  231  243 267    4
3: 300 315   324  347  364 400    5
4: 400 420   433  463  486 535    6
5: 500 525   541  579  608 669    3
6: 600 630   649  694  729 802    4
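I would like to use data.table to go row by row, take the number in column "curr", and compute the average of the value in the column at that position and the value in the column immediately before it. For example, for the first row, curr is 3, so it would take the value 108 from column "three" and the value 105 from column "two" and give 106.5. For the second row, it would take the average of 231 and 216, and so on.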
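To make the rule concrete, the first-row calculation can be checked on a plain vector. This is only an illustrative sketch: `row1` is a hypothetical name, and its values are copied from the printout above.

```r
# First row of the table as a plain named vector (values copied from the printout)
row1 <- c(one = 100, two = 105, three = 108, four = 116, five = 122, six = 134)
curr <- 3

# Average of the value at position curr and the one immediately before it
mean(row1[c(curr - 1, curr)])
```

Here position 3 is "three" (108) and position 2 is "two" (105), so the result should be 106.5.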
Answer 0 (score: 1)
This will do the job:
# create a column with row positions
dt[, rowpos := .I]
# create a function that will be applied to every pair of column and row position
myfunc <- function(colpos,rowpos) { mean(c(as.matrix(dt)[rowpos,colpos],
as.matrix(dt)[rowpos,(colpos-1)])) }
# apply function
dt[ , var := myfunc(curr, rowpos) , by = rowpos]
#>    one two three four five six curr rowpos   var
#> 1: 100 105   108  116  122 134    3      1 106.5
#> 2: 200 210   216  231  243 267    4      2 223.5
#> 3: 300 315   324  347  364 400    5      3 355.5
#> 4: 400 420   433  463  486 535    6      4 510.5
#> 5: 500 525   541  579  608 669    3      5 533.0
#> 6: 600 630   649  694  729 802    4      6 671.5
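The function above rebuilds the matrix for every row. A possible vectorized variant, sketched here under the assumption that every curr is at least 2 and that the value columns are exactly one through six, looks up all cells at once with a two-column index matrix. The names `m`, `rows` and `avg` are made up for this illustration.

```r
library(data.table)

# Rebuild the example table, with curr spelled out for all six rows
dt <- data.table(one = c(100, 200, 300, 400, 500, 600))
dt[, two := round(one * 1.05)]
dt[, three := round(two * 1.03)]
dt[, four := round(three * 1.07)]
dt[, five := round(four * 1.05)]
dt[, six := round(five * 1.1)]
dt[, curr := c(3, 4, 5, 6, 3, 4)]

# Value columns as a numeric matrix, converted once
m <- as.matrix(dt[, .(one, two, three, four, five, six)])
rows <- seq_len(nrow(dt))

# cbind(row, col) picks one cell per row; average it with the cell to its left
dt[, avg := (m[cbind(rows, curr)] + m[cbind(rows, curr - 1)]) / 2]
dt
```

This avoids both the helper column rowpos and the per-row grouping, at the cost of holding one extra copy of the value columns in memory.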
Answer 1 (score: 0)
Here is another solution (answering my own question):
dt[, rowpos := .I] #Using Rafa's idea above
dt[, c("new","new_prev") := .SD[, c(curr,curr-1), with=FALSE], by = rowpos]
dt$mean <- rowMeans(subset(dt, select = c(new, new_prev)),na.rm=TRUE)
> dt
   one two three four five six curr rowpos new new_prev  mean
1: 100 105   108  116  122 134    3      1 108      105 106.5
2: 200 210   216  231  243 267    4      2 231      216 223.5
3: 300 315   324  347  364 400    5      3 364      347 355.5
4: 400 420   433  463  486 535    6      4 535      486 510.5
5: 500 525   541  579  608 669    3      5 541      525 533.0
6: 600 630   649  694  729 802    4      6 694      649 671.5