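I want to loop over the independent variables and regress the dependent variable on each one individually using data.table. My dataset is large, so I need an efficient solution. I found the following suggestion, which uses the mtcars data frame as an example: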
library(data.table)
Fits <- as.data.table(mtcars)[, list(MyFits = lapply(.SD[, -1, with = FALSE], function(x) summary(lm(mpg ~ x))))]
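I first tried this on some of my own datasets, with little success. I then applied it to mtcars itself, which gave the following unexpected result: the MyFits column has 10 rows, and each row looks like the example below.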
list(call = lm(formula = mpg ~ x), terms = mpg ~ x, residuals = c(0.370164348925326, 0.370164348925409, -3.58141592920354, 0.770164348925413, 3.82174462705436, -2.52983565107458, -0.578255372945635, -1.98141592920354, -3.58141592920354, -1.42983565107459, -2.82983565107459, 1.52174462705436, 2.42174462705436, 0.321744627054363, -4.47825537294564, -4.47825537294564, -0.178255372945637, 6.01858407079646, 4.01858407079646, 7.51858407079646, -4.88141592920354, 0.621744627054364, 0.321744627054363, -1.57825537294564, 4.32174462705436, 0.918584070796464, -0.381415929203536, 4.01858407079646, 0.921744627054365, -0.929835651074587, 0.121744627054364, -4.98141592920354), coefficients = c(37.8845764854614, -2.87579013906447, 2.07384360552423, 0.322408882659104, 18.2678078445963, -8.91969884745751, 8.36915530493018e-18, 6.11268714258098e-10), aliased = c(FALSE, FALSE), sigma = 3.20590203190608, df = c(2, 30, 2), r.squared = 0.726180005093805, adj.r.squared = 0.717052671930265, fstatistic = c(79.5610275293349, 1, 30 ), cov.unscaled = c(0.418457648546144, -0.0625790139064475, -0.0625790139064475, 0.0101137800252844)
)
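For reference, here is a minimal sketch of how a list of fits like this could be unpacked into a compact table, assuming each element is a `summary.lm` object, as the output above suggests. The names `predictors`, `fits`, and `results` are illustrative only and do not come from the linked answer, and this may not be the most efficient approach for a large dataset.

```r
library(data.table)

dt <- as.data.table(mtcars)

# Assumption: mpg is the dependent variable, every other column is a predictor
predictors <- setdiff(names(dt), "mpg")

# Fit mpg ~ x once per predictor and keep the summary objects in a named list
fits <- lapply(dt[, ..predictors], function(x) summary(lm(dt$mpg ~ x)))

# Pull the slope, its p-value, and R-squared into one row per predictor
results <- rbindlist(lapply(names(fits), function(v) {
  cf <- coef(fits[[v]])  # coefficient matrix with rows "(Intercept)" and "x"
  data.table(
    variable  = v,
    slope     = cf["x", "Estimate"],
    p_value   = cf["x", "Pr(>|t|)"],
    r_squared = fits[[v]]$r.squared
  )
}))

print(results)
```

Each row of `results` then corresponds to one of the 10 entries in `MyFits`, which would make it easier to see whether the fits themselves are wrong or only their printed form is unexpected.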
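The author of the answer to "Linear Regression loop for each independent variable individually against dependent" has already mentioned that the answer needs updating, but I have not been able to work out what is going wrong.
Any suggestions?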