Multiple regressions with missing values in each dependent variable

Asked: 2013-10-18 12:08:12

Tags: r lm

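My data set contains monthly returns for 1380 hedge funds, but most of the funds have missing data. I want to regress each fund's monthly returns on some factors, such as the Treasury yield (TBY). I tried using a for loop to regress each fund's monthly returns on the factors, but received the following error message: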

#Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
#  0 (non-NA) cases

I did some searching online and think the problem is caused by listwise deletion. I have reproduced a simple case to illustrate:

#create a dataframe A with 8 funds and two factors
A<-data.frame(fund1=rnorm(5),fund2=rnorm(5),fund3=rnorm(5),fund4=rnorm(5),
              fund5=rnorm(5),fund6=rnorm(5),fund7=rnorm(5),fund8=rnorm(5),
              SP500=rnorm(5),TBY=rnorm(5))

# replace some values with NA
A[1,3:5]<-NA
A[2,1:2]<-NA
A[3,3]<-NA
A[4,2:4]<-NA
A[5,1]<-NA
A[1:5,7]<-NA
A

# build two data frames to split funds and factors
funds<-as.data.frame(A[,1:8])
factors<-as.data.frame(A[,9:10])
# build empty data frame to store regression outputs
results<-data.frame(matrix(NA,ncol=4,nrow=8))
colnames(results)<-c("estimates", "residual", "t", "p")
rownames(results)<-as.vector(colnames(funds))

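for(i in 1:8){
  fit<-lm(as.vector(funds[,i])~TBY,data=factors,na.action=na.omit)
  # row 1 of the coefficient table is the intercept; its four columns are
  # Estimate, Std. Error, t value and Pr(>|t|)
  results[i,1]<-coef(summary(fit))[1,1]
  results[i,2]<-coef(summary(fit))[1,2]
  results[i,3]<-coef(summary(fit))[1,3]
  results[i,4]<-coef(summary(fit))[1,4]
}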
results

The results are as follows:

   results
   #        estimates  residual          t         p
   # fund1  0.1039720 0.2486456  0.4181535 0.7478621
   # fund2 -0.1040939 0.2464246 -0.4224168 0.7455554
   # fund3  0.3869647       NaN        NaN       NaN
   # fund4  0.1349445 0.2107588  0.6402796 0.6374377
   # fund5  0.7470140 0.4066014  1.8372147 0.2075786
   # fund6  0.8305238 0.3845686  2.1596245 0.1196180
   # fund7         NA        NA         NA        NA
   # fund8         NA        NA         NA        NA

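The program stops at the fund7 iteration. I think the main reason is that the fund7 column contains only NAs, so the loop cannot continue. Can anyone suggest how to let the program continue in this case? The result I want is the constant (intercept) of each regression model. Any comments would be much appreciated.

Thanks.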

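1 answer:

Answer 0 (score: 0)

Wrapping the loop body in try allows the loop to continue after an error occurs. You can also assign an entire row of results at once, like this: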

for(i in 1:8)try({fit<-lm(as.vector(funds[,i])~TBY,data=factors,na.action=na.omit)
          results[i,]<-coef(summary(fit))[1,]
         })
## Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
##   0 (non-NA) cases
results
##        estimates  residual          t         p
## fund1  0.1977773 0.1949221  1.0146478 0.4953715
## fund2  0.7192174 0.2862573  2.5124861 0.2411462
## fund3  2.9271787       NaN        NaN       NaN
## fund4  0.8757588 2.1261925  0.4118906 0.7512633
## fund5 -0.3371507 0.5472105 -0.6161262 0.6005921
## fund6  0.2844758 0.3068079  0.9272114 0.4222080
## fund7         NA        NA         NA        NA
## fund8 -0.2380825 0.2613918 -0.9108262 0.4295420
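
If you would rather not rely on catching the error, one option is to count the usable observations for each fund before calling lm and skip the funds that have none. The following is only a minimal sketch on a cut-down version of the example data; the fixed seed, the reduced set of funds and the helper variable `ok` are my own assumptions and are not part of the question.

```r
# Self-contained sketch: rebuild a small example, then skip funds that
# have no usable observations instead of relying on try().
set.seed(1)  # assumption: fixed seed, only so the example is reproducible
A <- data.frame(fund1 = rnorm(5), fund2 = rnorm(5), fund7 = NA_real_,
                fund8 = rnorm(5), TBY = rnorm(5))
A[2, 1:2] <- NA  # a few scattered missing values, as in the question

funds   <- A[, c("fund1", "fund2", "fund7", "fund8")]
factors <- A[, "TBY", drop = FALSE]

# one row per fund, pre-filled with NA
results <- data.frame(matrix(NA_real_, ncol = 4, nrow = ncol(funds)))
colnames(results) <- c("estimates", "residual", "t", "p")
rownames(results) <- colnames(funds)

for (i in seq_along(funds)) {
  # rows where both the fund return and the factor are observed
  ok <- complete.cases(funds[[i]], factors$TBY)
  if (sum(ok) == 0) next  # nothing to fit; leave this row as NA
  fit <- lm(funds[[i]] ~ TBY, data = factors, na.action = na.omit)
  results[i, ] <- coef(summary(fit))[1, ]  # intercept row
}
results
```

This leaves the fund7 row as NA without printing an error. Funds with only one or two usable observations are still fitted, so NaN standard errors like the one for fund3 above can still appear.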

By the way, you can avoid the loop entirely by using sapply together with tryCatch:

sapply(funds,function(x)
       tryCatch(coef(summary(lm(x ~ TBY,data=factors,na.action=na.omit)))[1,],
                error= function(e)rep(NA,4)))

##                fund1     fund2    fund3     fund4      fund5     fund6 fund7
## Estimate   0.1977773 0.7192174 2.927179 0.8757588 -0.3371507 0.2844758    NA
## Std. Error 0.1949221 0.2862573      NaN 2.1261925  0.5472105 0.3068079    NA
## t value    1.0146478 2.5124861      NaN 0.4118906 -0.6161262 0.9272114    NA
## Pr(>|t|)   0.4953715 0.2411462      NaN 0.7512633  0.6005921 0.4222080    NA
##                 fund8
## Estimate   -0.2380825
## Std. Error  0.2613918
## t value    -0.9108262
## Pr(>|t|)    0.4295420
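
Note that sapply returns the statistics in rows and the funds in columns. If you prefer one row per fund, as in the results data frame from the question, you can transpose the matrix. This is a minimal sketch on a cut-down version of the example data; the seed, the reduced set of funds and the names `stats` and `per_fund` are assumptions made for illustration.

```r
# Self-contained sketch: collect the intercept statistics with sapply/tryCatch,
# then transpose so that each fund becomes a row.
set.seed(1)  # assumption: fixed seed, only so the example is reproducible
A <- data.frame(fund1 = rnorm(5), fund2 = rnorm(5), fund7 = NA_real_,
                TBY = rnorm(5))
funds   <- A[, c("fund1", "fund2", "fund7")]
factors <- A[, "TBY", drop = FALSE]

stats <- sapply(funds, function(y)
  tryCatch(coef(summary(lm(y ~ TBY, data = factors, na.action = na.omit)))[1, ],
           error = function(e) rep(NA_real_, 4)))  # all-NA fund: fill with NA

# stats is a 4 x (number of funds) matrix; t() flips it to one row per fund
per_fund <- as.data.frame(t(stats))
per_fund
```

The column names of per_fund are taken from the coefficient table (Estimate, Std. Error, t value, Pr(>|t|)), which also makes it clear that the second column is the standard error of the intercept.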