我想循环遍历数据帧中的变量,在每个变量上调用lm(),所以我写了这个:
findvars <- function(x = samsungData, dv = 'activity', id = 'subject') {
# Loops through the possible predictor vars, does an lm() predicting the dv
# from each, and returns a data.frame of coefficients, one row per IV.
r <- data.frame()
# All varnames apart from the dependent var, and the case identifier
ivs <- setdiff(names(x), c(dv, id))
for (iv in ivs) {
print(paste("trying", iv))
m <- lm(dv ~ iv, data = x, na.rm = TRUE)
# Take the absolute value of the coefficient, then transpose.
c <- t(as.data.frame(sapply(m$coefficients, abs)))
c$iv <- iv # which IV produced this row?
r <- c(r, c)
}
return(r)
}
这不起作用,我相信b / c lm()调用中的公式由函数局部变量组成,这些变量在传入的数据帧中包含命名变量的字符串(例如,&#34; my_dependant_var&# 34;和&#34; this_iv&#34;)而不是指向实际变量对象的指针。
我尝试在eval(parse(text =))中包装该公式,但无法使其工作。
如果我对这个问题是正确的,有人可以向我解释如何让R解决这些变量的内容iv&amp; dv进入我需要的指针?或者,如果我错了,有人可以解释还有什么事吗?
非常感谢!
以下是一些重复代码:
library(datasets)
data(USJudgeRatings)
findvars(x = USJudgeRatings, dv = 'CONT', id = 'DILG')
答案 0 :(得分:3)
除了你的公式问题之外,你的功能中还有足够多的坏事发生,我认为有人应该引导你完成所有这些。以下是一些注释,然后是更好的版本:
#For small examples, "growing" objects isn't a huge deal,
# but you will regret it very, very quickly. It's a bad
# habit. Learn to ditch it now. So don't inititalize
# empty lists and data frames.
r <- data.frame()
ivs <- setdiff(names(x), c(dv, id))
for (iv in ivs) {
print(paste("trying", iv))
#There is no na.rm argument to lm, only na.action
m <- lm(dv ~ iv, data = x, na.rm = TRUE)
#Best not to name variables c, its a common function, see two lines from now!
# Also, use the coef() extractor functions, not $. That way, if/when
# authors change the object structure your code won't break.
#Finally, abs is vectorized, no need for sapply
c <- t(as.data.frame(sapply(m$coefficients, abs)))
#This is probably best stored in the name
c$iv <- iv # which IV produced this row?
#Growing objects == bad! Also, are you sure you know what happens when
# you concatenate two data frames?
r <- c(r, c)
}
return(r)
}
尝试这样的事情:
findvars <- function(x,dv,id){
ivs <- setdiff(names(x),c(dv,id))
#initialize result list of the appropriate length
result <- setNames(vector("list",length(ivs)),ivs)
for (i in seq_along(ivs)){
result[[i]] <- abs(coef(lm(paste(dv,ivs[i],sep = "~"),data = x,na.action = na.omit)))
}
result
}