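Running speedlm on weighted data with missing values

Time: 2016-11-22 08:11:05

Tags: r speedglm

I am trying to run a linear regression on weighted data. When I use speedlm, I get an error if the data contain missing values.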

 library(speedglm)
 sampleData <- data.frame(w = round(runif(12,0,1)),
                          target = rnorm(12,100,50),
                          predictor = c(NA, rnorm(10, 40, 10),NA))

 summary(sampleData)
       w              target          predictor    
 Min.   :0.0000   Min.   : -3.381   Min.   :22.58  
 1st Qu.:0.0000   1st Qu.: 48.321   1st Qu.:30.45  
 Median :1.0000   Median : 84.156   Median :37.09  
 Mean   :0.5833   Mean   : 92.306   Mean   :35.03  
 3rd Qu.:1.0000   3rd Qu.:119.891   3rd Qu.:41.96  
 Max.   :1.0000   Max.   :223.896   Max.   :43.48  
                                    NA's   :2
 #run linear regression without weights
 linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
 speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)

 #run linear regression with weights
 linearWithWeights <- lm(formula("target~predictor"), data = sampleData, weights = sampleData[, "w"])
 speedLinearWithWheights <- speedlm(formula("target~predictor"), data = sampleData, weights = sampleData[, "w"])
Error in base::crossprod(x, y) : non-conformable arguments
In addition: Warning messages:
1: In sqw * X :
  longer object length is not a multiple of shorter object length
2: In sqw * y :
  longer object length is not a multiple of shorter object length
Called from: base::crossprod(x, y)

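Is there a way to do this without being forced to fix the data before running the regression?

1 Answer:

Answer 0: (score: 1)

You should try changing the na.action option. Below is the code I was able to run after changing na.action to na.exclude / na.omit.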

library(speedglm)
sampleData <- data.frame(w = round(runif(12,0,1)),
                         target = rnorm(12,100,50),
                         predictor = c(NA, rnorm(10, 40, 10),NA))
summary(sampleData)

linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)

options(na.action="na.exclude") # or "na.omit"

linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)
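For the weighted fit, the warnings in the question (`sqw * X`: longer object length is not a multiple of shorter object length) suggest that the weight vector still has 12 entries while the model matrix has only the 10 complete rows. One option that does not depend on how `speedlm` aligns `weights` with rows dropped for missing values is to filter the rows once and take the weights from the same filtered data. The following is a minimal sketch under that assumption; `keep` and `cleanData` are names introduced here for illustration, and `set.seed(1)` is only there to make the sketch reproducible.

```r
library(speedglm)

set.seed(1)  # assumption: fixed seed, only for reproducibility of this sketch
sampleData <- data.frame(w = round(runif(12, 0, 1)),
                         target = rnorm(12, 100, 50),
                         predictor = c(NA, rnorm(10, 40, 10), NA))

# Flag rows with no NA in the response, the predictor, or the weights.
keep <- complete.cases(sampleData[, c("target", "predictor", "w")])

# Subset once so that data rows and weights share the same index.
cleanData <- sampleData[keep, ]

# Weighted fits on the filtered data; weights now have one entry per row.
lmFit    <- lm(target ~ predictor, data = cleanData, weights = cleanData$w)
speedFit <- speedlm(target ~ predictor, data = cleanData, weights = cleanData$w)

# Compare the coefficient estimates from both fits.
coef(lmFit)
coef(speedFit)
```

This does filter the data before fitting, but it is a single line and leaves `sampleData` itself unchanged.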

You can look through the documentation for na.omit and na.exclude to see when to use each. Hope this helps.
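As a rough illustration of the difference between the two settings, the sketch below fits the same model with base `lm` under each one. The data frame `demoData` is made up for this example and is not the data from the question. Both settings drop the incomplete rows before fitting, so the coefficients match; the difference is that `na.exclude` pads residuals (and fitted values) with `NA` so they line up with the original rows, while `na.omit` returns them only for the rows that were used.

```r
# Illustrative data (an assumption, not from the question):
# the predictor has missing values in the first and last rows.
set.seed(1)
demoData <- data.frame(target = rnorm(12, 100, 50),
                       predictor = c(NA, rnorm(10, 40, 10), NA))

fitOmit    <- lm(target ~ predictor, data = demoData, na.action = na.omit)
fitExclude <- lm(target ~ predictor, data = demoData, na.action = na.exclude)

# Same rows enter both fits, so the estimates agree.
all.equal(coef(fitOmit), coef(fitExclude))

# na.omit: one residual per row actually used in the fit.
length(residuals(fitOmit))

# na.exclude: residuals padded with NA back to nrow(demoData).
length(residuals(fitExclude))
which(is.na(residuals(fitExclude)))
```

If the residuals need to be attached back to the original data frame as a new column, `na.exclude` is usually the more convenient choice.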