I want to use the column names of a data.table as input to a formula. When I insert the column name directly, it works; when I load the name from an object, it does not.
library(data.table)
dt <- data.table(ID= c(1,2,3,4,5,6,7,8,9),
var1 = c(100,150,200,180,10,15,11,25,1),
var2 = c(150,200,250,300,15,20,19,30,2),
var3 = c(100,101,102,103,104,105,106,107,109))
# Inserting the column name directly in the formula seems to work
dt[, var1 := ( var1 - mean(var1, na.rm = TRUE)/sd(var1, na.rm = TRUE)) ]
# Loading the name from an object does not work
Names <- c("var1", "var2", "var3")
for (i in 1:3){
dt[, Names[i] := ( Names[i] - mean(Names[i], na.rm = TRUE)/sd(Names[i], na.rm = TRUE)) ]}
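I think this is related to the fact that Names[1] gives me "var1" rather than var1. I searched the forum for similar questions and found commands such as as.symbol() and as.name(), but they did not seem to help.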
Answer 0 (score: 2)
One option is to use get to retrieve the value of the object:
for (i in 1:3){
dt[, (Names[i]) := ( get(Names[i]) - mean(get(Names[i]),
na.rm = TRUE)/sd(get(Names[i]), na.rm = TRUE)) ]
}
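As a rough illustration of why get makes the difference, here is a minimal sketch on a hypothetical toy table (toy, nm, res, and shifted are made-up names, not from the question). The bare string is plain character data, so arithmetic on it fails, whereas get(nm) returns the column that the string names.

```r
library(data.table)

# Hypothetical toy table, used only for illustration
toy <- data.table(a = c(1, 2, 3))
nm <- "a"

# nm is just the character string "a", so subtracting from it raises an error;
# tryCatch turns that error into the marker value "error"
res <- tryCatch(toy[, nm - 1], error = function(e) "error")

# get() looks the name up among the columns and returns the numeric vector
shifted <- toy[, get(nm) - 1]
```

The left-hand side of := needs similar care: wrapping it in parentheses, as in (Names[i]), tells data.table to evaluate the expression and use the resulting string as the column name.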
Another option is set:
for(j in Names){
set(dt, i = NULL, j = j, value = (dt[[j]] - mean(dt[[j]],
na.rm = TRUE)/sd(dt[[j]], na.rm = TRUE)))
}
dt
# ID var1 var2 var3
#1: 1 99.05324836 149.060863 64.52132
#2: 2 149.05324836 199.060863 65.52132
#3: 3 199.05324836 249.060863 66.52132
#4: 4 179.05324836 299.060863 67.52132
#5: 5 9.05324836 14.060863 68.52132
#6: 6 14.05324836 19.060863 69.52132
#7: 7 10.05324836 18.060863 70.52132
#8: 8 24.05324836 29.060863 71.52132
#9: 9 0.05324836 1.060863 73.52132
Alternatively, specify Names in .SDcols, loop over the Subset of Data.table (.SD), perform the calculation, and assign (:=) the output back to the columns in Names:
dt[, (Names) := lapply(.SD, function(x) x- mean(x, na.rm = TRUE)/sd(x,
na.rm = TRUE)), .SDcols = Names]
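A side note on the formula itself: because division binds more tightly than subtraction in R, x - mean(x)/sd(x) subtracts the ratio mean/sd from every value, which is what the output above shows. If the intent was to standardize each column (an assumption, since the question does not say so), the subtraction would need its own parentheses. A minimal sketch on a hypothetical toy table (toy, cols, and standardize are illustrative names):

```r
library(data.table)

# Hypothetical toy table, used only for illustration
toy <- data.table(a = c(1, 2, 3), b = c(10, 20, 40))
cols <- c("a", "b")

# Parenthesize the subtraction so the centered values are divided by the sd
standardize <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# Apply to every column listed in cols and assign the results back in place
toy[, (cols) := lapply(.SD, standardize), .SDcols = cols]
```

With this version, each column in cols ends up with mean 0 and standard deviation 1.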