Regression loop over a data frame in R

Time: 2015-06-01 21:50:42

Tags: r loops statistics dataframe regression

rm(list=ls())
myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") 
for(i in names(myData))
{
    colNum <- grep(i,colnames(myData)) ## looks up the index of the column named i
    if(is.numeric(myData[3,colNum]))  ##if row 3 is numeric, the entire column is 
   {
        ##print(nxeData[,i])        
        fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'
        rsq <- summary(fit)$r.squared   
   }
}
  

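I am running a regression loop over multiple columns, comparing each one against a single independent-variable column. I have written most of the code, but I am not sure how to print each column's R-squared value against the etch_source_Avg parameter together with that column's name. Ideally it would look like this: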

     

.765 "Variable name 1"

     

.436 "Variable name 2" ... and so on

1 answer:

Answer 0 (score: 1)

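Here is a quick rewrite of your code that should give you what you need. Since myData should be a data.frame, there is no need to look up an index for each column; you can access each column directly by its name.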

rm(list=ls())
myData <- read.csv(file="C:/Users/Documents/myfile.csv", header=TRUE, sep=",")
for(i in names(myData))
{
    if(is.numeric(myData[3,i]))  ## if row 3 is numeric, the entire column is
    {
       fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) # regress each column of the CSV file against the independent variable 'etch_source_Avg'
       rsq <- summary(fit)$r.squared
       writeLines(paste(rsq, i))  # print the R-squared value followed by the column name (note the capital L in writeLines)
    }
}
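
If you would rather keep the R-squared values than only print them, one possible variation is to collect them into a data frame. This is only a sketch: the CSV from the question is not available, so the data below is simulated, and the columns `label`, `var1` and `var2` are hypothetical stand-ins. It also skips `etch_source_Avg` itself, which the loop above would otherwise regress against itself.

```r
# Simulated stand-in for the CSV from the question (assumption: the real
# data has a numeric column named etch_source_Avg plus other columns).
set.seed(1)
myData <- data.frame(
  etch_source_Avg = rnorm(50),
  label = sample(c("a", "b"), 50, replace = TRUE),  # non-numeric, will be skipped
  stringsAsFactors = FALSE
)
myData$var1 <- 2 * myData$etch_source_Avg + rnorm(50)  # related to the predictor
myData$var2 <- rnorm(50)                               # unrelated noise

predictor <- "etch_source_Avg"

# Keep only numeric columns, excluding the predictor itself.
is_num <- vapply(myData, is.numeric, logical(1))
responses <- setdiff(names(myData)[is_num], predictor)

# Fit one model per response column and extract its R-squared.
rsq <- vapply(responses, function(col) {
  fit <- lm(reformulate(predictor, response = col), data = myData)
  summary(fit)$r.squared
}, numeric(1))

# One row per column: the name next to its R-squared value.
results <- data.frame(variable = names(rsq), r_squared = unname(rsq))
print(results)
```

Checking `is.numeric` on the whole column avoids depending on a particular row, and building the formula with `reformulate` keeps the column name attached to each fitted model.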

Hope this helps.