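I can see that this is almost done, but I am new to R and can't work out the last step. In short, I have a regression loop (please don't criticize the data mining), and from each iteration I need to report a few values into a new list, data frame, or whatever is most appropriate. Here is my code: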
#Required packages
require(lattice)
require(plyr)
ACDn <- "ACDn.csv"
x <- as.matrix(read.csv(ACDn, colClasses = "numeric"))
#To find which columns are which, in order to split up the overall dataset.
which( colnames(x)=="H95.H30" )
which( colnames(x)=="H99" )
#Here I split all the data into their different types, i.e. LHt = Lidar Heights. Please ignore
#those that are unpopulated, as I am waiting on data to run.
Yall <- x[,c(59:79)] #All "True Variables" - BA, MTH, etc.
Y <- Yall[,10] #Specifies which column is the Y variable, BA = 10,
#TopHt = 11, SPH = 12, Vol_live = 13, RecovVol = 14
X <- x[,c(1:58,80:95)] #All Lidar metrics and combinations.
LHt <- X[,c(28:41,59:74)]
LCv <- X[,c()]
LKu <- X[,c()]
LSk <- X[,c()]
L?? <- X[,c()]
#Create List file. I
Optmod1 <-
#Loop creation; needs the dataset sizes. The ?? are there because I do not yet know the exact sizes
#of the respective datasets. Somewhere in here I would like an entry for EACH model to be
#appended to a data.frame (or list, whatever is most appropriate), which would state the variables,
# i.e. 'y', 'i', 'j', 'k', 'l', 'm', and the Adj. R-squared value (which I guess can be extracted
# using summary(mod)$adj.r.squared).
for (i in 1:30) {
  for (j in 1:??) {
    for (k in 1:??) {
      for (l in 1:??) {
        for (m in 1:??) {
          # [, i] selects column i of the matrix; [i] would pick a single element
          mod <- lm(Y ~ LHt[, i] + LCv[, j] + LKu[, k] + LSk[, l] + L??[, m])
        }
      }
    }
  }
}
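So, after each run of 'mod', I just need it to write 'Y', 'i', 'j', 'k', 'l', 'm' and the adjusted R-squared (I guess by using summary(mod)$adj.r.squared) into a table that I can extract.
Sorry if this is R-illiterate; I am new to this and was simply handed the code, so my basic understanding is minimal.
Thanks for your time!
P.S. Feel free to ask any questions - I will do my best to answer them!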
Answer 0 (score: 1)
The short answer to your question is
Answers = list()
for (i in 1:30) {
  for (j in 1:??) {
    for (k in 1:??) {
      for (l in 1:??) {
        for (m in 1:??) {
          # [, i] selects column i of the matrix
          mod <- lm(Y ~ LHt[, i] + LCv[, j] + LKu[, k] + LSk[, l] + L??[, m])
          # append one record per fitted model
          Answers[[length(Answers)+1]] = list(i,j,k,l,m,summary(mod)$adj.r.squared)
        }
      }
    }
  }
}
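This stores the information you want in a list. It works by creating an empty list and then appending to it every time a regression model is fitted inside the loop. However, growing a list inside a loop like this is poor R practice, because the list may be copied on each append and this gets slow as the number of models grows.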
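If you want to keep the loop, one way around the growing list is to allocate the result table up front, since the number of models is known in advance. The following is only a sketch under assumptions: the matrices are small random stand-ins for the real data, only two of the five variable groups are used, and the names n_models, results and row are made up for illustration.

```r
# Toy stand-ins for the real data (hypothetical, for illustration only)
set.seed(1)
n   <- 50
Y   <- rnorm(n)
LHt <- matrix(rnorm(n * 3), ncol = 3)
LCv <- matrix(rnorm(n * 2), ncol = 2)

# The number of models is known before the loop starts,
# so the result table can be allocated once
n_models <- ncol(LHt) * ncol(LCv)
results  <- data.frame(i = integer(n_models),
                       j = integer(n_models),
                       adj_r2 = numeric(n_models))

row <- 1
for (i in seq_len(ncol(LHt))) {
  for (j in seq_len(ncol(LCv))) {
    mod <- lm(Y ~ LHt[, i] + LCv[, j])
    # Fill the pre-allocated row instead of appending
    results$i[row]      <- i
    results$j[row]      <- j
    results$adj_r2[row] <- summary(mod)$adj.r.squared
    row <- row + 1
  }
}

# Sort so that the best-fitting model comes first
results <- results[order(-results$adj_r2), ]
head(results)
```

With the real data, the remaining three loops and index columns (k, l, m) would be added in the same way.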
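It would probably be better to first write all possible formulas of the form LHt[, i] + LCv[, j] + LKu[, k] + LSk[, l] + L??[, m] into a list, and then use lapply to run the regressions.
First, use expand.grid to build a data frame with 5 columns, where each row holds one variable name from each category: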
LHt_names = paste0("LHt[, ", 1:30, "]") #character vector of LHt column references for use in formulas
LCv_names = paste0("LCv[, ", 1:??, "]") #similar for LCv
LKu_names = ...
LSk_names = ...
L??_names = ...
#pass each category as its own argument so that every row of temp is one combination
temp = expand.grid(LHt_names, LCv_names, LKu_names, LSk_names, L??_names, stringsAsFactors = FALSE)
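To get a feel for what expand.grid returns and how fast the number of combinations grows, here is a small sketch with only two categories of two variables each. The sizes and the name combos are assumptions chosen for illustration, not the real dataset dimensions.

```r
# Hypothetical, shortened name vectors: 2 variables in each of 2 categories
LHt_names <- paste0("LHt[, ", 1:2, "]")
LCv_names <- paste0("LCv[, ", 1:2, "]")

# One row per combination; stringsAsFactors = FALSE keeps the names as text
combos <- expand.grid(LHt = LHt_names, LCv = LCv_names,
                      stringsAsFactors = FALSE)
print(combos)

# The row count is the product of the category sizes (here 2 * 2)
nrow(combos)
```

With five categories the row count is the product of all five sizes, so it is worth checking nrow before fitting anything.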
Then use paste and lapply to get a list of formulas:
list_of_formulas = lapply(seq_len(nrow(temp)), function(i) paste("Y ~", paste(unlist(temp[i,]), collapse = " + "))) #seq_len, not seq_along, to visit every row
Then use lapply to get a list of regression models:
list_of_models = lapply(list_of_formulas, function(f) lm(as.formula(f)))
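The question also asks for a table with the variable choices and the adjusted R-squared of every model. A minimal end-to-end sketch of the lapply approach is given below. It assumes small random matrices in place of the real data and only two variable groups, and the names combos, formulas and models are illustrative.

```r
# Toy stand-ins for the real data (hypothetical, for illustration only)
set.seed(42)
n   <- 40
Y   <- rnorm(n)
LHt <- matrix(rnorm(n * 3), ncol = 3)
LCv <- matrix(rnorm(n * 2), ncol = 2)

# Every combination of one LHt column and one LCv column
combos <- expand.grid(LHt = paste0("LHt[, ", 1:3, "]"),
                      LCv = paste0("LCv[, ", 1:2, "]"),
                      stringsAsFactors = FALSE)

# paste is vectorised, so this builds one formula string per row
formulas <- paste("Y ~", combos$LHt, "+", combos$LCv)

# Fit one model per formula
models <- lapply(formulas, function(f) lm(as.formula(f)))

# Pull the adjusted R-squared out of each model and attach it to the table
combos$adj_r2 <- sapply(models, function(m) summary(m)$adj.r.squared)

# Best model first
combos <- combos[order(combos$adj_r2, decreasing = TRUE), ]
head(combos)
```

The resulting data frame can be written out with write.csv if an extractable table is needed.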