I'm trying to write a function that creates new flag indicator columns for a selected set of columns in a data set.
# Data Set
library(data.table)
A = as.factor(c(0,2,1,0))
B = as.factor(c(2,NA,1,0))
C = as.factor(c(1,0,NA,0))
D = as.factor(c(NA,2,0,1))
dat = data.table(A, B, C, D)
Currently I'm doing this separately for every column I want:
# What I'm currently doing (expected output of loop matches these columns)
attach(dat)
VAR = B
dat$b.test[VAR == "0"] <- "0"
dat$b.test[VAR == "1" | VAR == "2"] <- "1"
VAR = C
dat$c.test[VAR == "0"] <- "0"
dat$c.test[VAR == "1" | VAR == "2"] <- "1"
VAR = D
dat$d.test[VAR == "0"] <- "0"
dat$d.test[VAR == "1" | VAR == "2"] <- "1"
detach(dat)
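It seems like I should be able to write a for loop that runs over a vector containing all the columns I want the logic applied to (B, C, D) and takes the new column name from another vector (b2, c2, d2) at the same position in each vector.
Attempted method 1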
# Failed method 1
attach(dat)
new.var = c(b2, c2, d2)
cur.var = c(B, C, D)
l = length(cur.var)
for(i in 1:l){
X = cur.var[i]
VAR = cur.var[i]
dat$X[VAR == "0"] <- "0"
dat$X[VAR == "1" | VAR == "2"] <- "1"
}
detach(dat)
This results in a single new column named X.
Attempted method 2
# Failed method 2
new.var = c(dat$b2, dat$c2, dat$d2)
cur.var = c(dat$B, dat$C, dat$D)
l = length(cur.var)
for(i in 1:l){
new.var[i] = ifelse(new.var[i] == "0", "0",
ifelse(new.var[i] == "1" | "2", "1", NA)
)
}
Is there a different approach I could try for this?
Answer 0 (score: 1)
You don't really need a loop. You can use the .SD variable to iterate over the columns. For example:
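change <- c("B", "C", "D")
# "0" stays "0"; any other non-missing level becomes "1"; NA stays NA
myfun <- function(x) ifelse(x == 0, "0", "1")
# .SD holds the columns listed in .SDcols; := adds the new columns by reference
dat[, paste0(change, ".test") := Map(myfun, .SD), .SDcols = change]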
# A B C D B.test C.test D.test
# 1: 0 2 1 NA 1 1 NA
# 2: 2 NA 0 2 NA 0 1
# 3: 1 1 NA 0 1 NA 0
# 4: 0 0 0 1 0 0 1
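If you still prefer an explicit loop with a separate vector of new names, as the question describes, one possible sketch is below. It assumes the columns are referred to by name as character strings (rather than through attach()) and uses data.table's set() to add each column by reference. The names b2, c2, d2 come from the question; the helper variables x and flag are only illustrative.

```r
library(data.table)

# Same data as in the question
dat <- data.table(
  A = factor(c(0, 2, 1, 0)),
  B = factor(c(2, NA, 1, 0)),
  C = factor(c(1, 0, NA, 0)),
  D = factor(c(NA, 2, 0, 1))
)

cur.var <- c("B", "C", "D")     # existing columns, referenced by name
new.var <- c("b2", "c2", "d2")  # new column names, matched by position

for (i in seq_along(cur.var)) {
  # Work on the factor labels as character strings
  x <- as.character(dat[[cur.var[i]]])
  # "0" stays "0"; "1" or "2" becomes "1"; NA (or any other value) becomes NA
  flag <- ifelse(x == "0", "0",
                 ifelse(x %in% c("1", "2"), "1", NA_character_))
  # Add the new column by reference under the matching new name
  set(dat, j = new.var[i], value = flag)
}
```

Passing column names as strings avoids the two problems in the attempts above: c(B, C, D) concatenates the column values into one long vector instead of keeping a list of columns, and dat$X always refers to a column literally named X.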