Simplify my function

Time: 2017-06-26 10:12:53

Tags: r function loops statistics

I have the following code:

z7 <- function(data, k, e){
  require(zoo)
  df = data
  r = df$ROA
  t = df$t
  EA = df$EA
  k = k
  e = e

  #Estimate rolling linear models
  models = rollapply(df, width = k, FUN = function(z) 
    coef(lm(r~t, data = as.data.frame(z))), by.column = FALSE, align ="right")

  #Extract residuals from the models
  res = rollapply(df, width= k, FUN = function(x) 
    residuals(lm(r~t, data = as.data.frame(x))), by.column = FALSE, align ="right")

  #Standard deviation and Mean of residuals, on a row basis
  s = as.data.frame(apply(res, 1, sd))
  m = as.data.frame(apply(res, 1, mean)) #note that this is approximately 0 due to detrending.

  #Combine the data define n as number of rows in the dataset
  dataset = cbind(models, res, m, s)
  n = as.vector(nrow(dataset))
  n
  dataset

  #Compute predictions at k+1
  for(i in n){
    x = k + 1
    preds = dataset$`(Intercept)` + dataset$t*(x)
    x = x + 1
  }

  #Compute coefficient of variation
  for(j in n){
    n2 = k +1 
    tau = ((1 + 1 / (4*(n2))) * (dataset$apply.res..1..sd./dataset$apply.res..1..mean.))
  }

  dataset3 = cbind(dataset, tau)
  dataset3
  #Compute mean of chi distribution and the adjusted standard deviation
  Mchi <- sqrt(2)*((gamma((k+1)/2))/gamma(k/2))
  S = s*Mchi*(k+1)/sqrt(k)

  #Compute z7, checking whether the adjusted sd or cv should be used
  for(i in nrow(dataset3)){
    if (abs(dataset3$tau*dataset3$preds) < e) {
      z = -(dataset3$EA + dataset3$preds) / S
    } else 
      z = -(dataset3$EA + dataset3$preds) /(dataset3$tau*dataset3$preds)
  }
}

Note that I am writing a function that produces an adjusted standardized score. Normally, a Z-score is defined as (x - mean) / sd.
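For reference, the ordinary Z-score mentioned here can be computed in one vectorized line. This is only a minimal sketch on made-up numbers; `x` and `z` are illustrative names and not part of the function above.

```r
# Plain (unadjusted) z-score on an illustrative sample
x <- c(2, 4, 6, 8)
z <- (x - mean(x)) / sd(x)   # sd() uses the n - 1 denominator
z
```

By construction the result has mean 0 and sample standard deviation 1, which is the property the adjusted score tries to keep for a trending series.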

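In this case, we take into account the fact that x is a non-stationary random variable. The measure therefore has to be estimated on a rolling basis and built up iteratively over the observations.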

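df is the dataset of interest, k is the window length used to estimate the rolling linear models, and e is a tolerance used to decide whether the coefficient of variation is too small to use, in which case the standard deviation adjusted for heteroscedasticity is substituted.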

When I run my function on the following test data, I get an error:

t = seq(0,15,1)
r = (100+50*sin(0.8*t))
EA = rnorm(0:15)
df = data.frame(t,r,EA)

test = z7(df, 3, 0.00000000001)

The error is:

Error in data.frame(..., check.names = FALSE) : 
arguments imply differing number of rows: 14, 0 

The traceback is:

5. stop(gettextf("arguments imply differing number of rows: %s",
       paste(unique(nrows), collapse = ", ")), domain = NA)
4. data.frame(..., check.names = FALSE)
3. cbind(deparse.level, ...)
2. cbind(dataset, tau)
1. z7(df, 3, 1e-11)

How can I fix this error? Also, is there a way to simplify my code?

Thanks.

1 Answer:

Answer 0: (score: 0)

I think the error occurs in

tau = ((1 + 1 / (4*(n2))) * (dataset$apply.res..1..sd./dataset$apply.res..1..mean.))

I changed it to

    tau = ((1 + 1 / (4*(n2))) * (dataset$`apply(res, 1, sd)`/dataset$`apply(res, 1, mean)`))
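To see why the original line fails, here is a small sketch with a made-up residual matrix (the numbers and the name `res_sd` are my own, not from the question). `as.data.frame()` names the column after the call itself, `cbind()` on a data frame keeps that name unchanged, and `$` with a name that does not exist returns NULL, so tau ends up with length 0.

```r
# Toy residual matrix: 4 windows x 3 residuals (made-up numbers)
res <- matrix(c(1, 2, 3, 4, 6, 5, 9, 8, 7, 2, 4, 8), nrow = 4)

s <- as.data.frame(apply(res, 1, sd))
names(s)                      # column is named after the deparsed call

dataset <- cbind(res, s)      # data.frame method, names are not "checked"

# The dotted name does not exist, so `$` returns NULL
missing_col <- is.null(dataset$apply.res..1..sd.)
tau <- dataset$apply.res..1..sd. / dataset$apply.res..1..mean.
length(tau)                   # zero-length vector

# Binding a zero-length vector to the data frame raises the reported error
msg <- tryCatch(cbind(dataset, tau), error = function(e) conditionMessage(e))
msg

# Giving the column an explicit name avoids the problem altogether
dataset2 <- data.frame(res, res_sd = apply(res, 1, sd))
```

Naming the columns explicitly, as in the last line, is probably easier to maintain than the backtick version.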

In the last for loop, I think there is a problem with dataset3$preds:
>dataset3$preds
NULL
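My reading is that preds is only ever a local vector and is never stored in the data frame, which is why the lookup gives NULL. A minimal sketch with made-up coefficients (the column names `intercept` and `slope` are hypothetical) shows this, and also that `for (i in n)` runs only once when n is a single number:

```r
# Made-up coefficients for two windows
dataset <- data.frame(intercept = c(1, 2), slope = c(0.5, 0.25))
k <- 3

# This creates a local vector; the data frame is not modified
preds <- dataset$intercept + dataset$slope * (k + 1)
missing_before <- is.null(dataset$preds)

# Store it explicitly so later steps can find it
dataset$preds <- preds

# `for (i in n)` iterates over the single value n, not over 1..n
n <- 14
iterations <- 0
for (i in n) iterations <- iterations + 1
iterations
```

Since the arithmetic is already vectorized, the loops can most likely be dropped.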

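At the start you declare r = df$ROA, but I think this sets r to NULL, because the test data frame has no ROA column (its columns are t, r and EA).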
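As for simplifying, here is one possible sketch rather than a definitive rewrite. It swaps `zoo::rollapply` for a plain base R `lapply` over window start positions so that it has no dependencies, fits each window once, and uses `ifelse()` instead of the loops. The name `z7_sketch` and the columns `t_end`, `intercept`, `slope`, `res_mean` and `res_sd` are hypothetical. I am also assuming that EA should be taken at the last row of each window (matching align = "right"), and I kept the formulas for preds, tau and S as posted without checking whether they are statistically what you intend.

```r
z7_sketch <- function(data, k, e) {
  stopifnot(all(c("t", "r", "EA") %in% names(data)), nrow(data) >= k)
  starts <- seq_len(nrow(data) - k + 1)

  # One row of results per rolling window of length k
  rows <- lapply(starts, function(i) {
    w   <- data[i:(i + k - 1), ]
    fit <- lm(r ~ t, data = w)
    res <- residuals(fit)
    data.frame(
      t_end     = w$t[k],
      EA        = w$EA[k],          # assumption: EA at the window's last row
      intercept = unname(coef(fit)[1]),
      slope     = unname(coef(fit)[2]),
      res_mean  = mean(res),
      res_sd    = sd(res)
    )
  })
  out <- do.call(rbind, rows)

  # Formulas copied from the question as written
  out$preds <- out$intercept + out$slope * (k + 1)
  out$tau   <- (1 + 1 / (4 * (k + 1))) * (out$res_sd / out$res_mean)
  Mchi      <- sqrt(2) * gamma((k + 1) / 2) / gamma(k / 2)
  out$S     <- out$res_sd * Mchi * (k + 1) / sqrt(k)

  # Vectorized choice between the two denominators
  denom_cv <- out$tau * out$preds
  out$z <- ifelse(abs(denom_cv) < e,
                  -(out$EA + out$preds) / out$S,
                  -(out$EA + out$preds) / denom_cv)
  out
}

set.seed(1)
t  <- seq(0, 15, 1)
r  <- 100 + 50 * sin(0.8 * t)
EA <- rnorm(16)
df <- data.frame(t, r, EA)

test <- z7_sketch(df, 3, 1e-11)
head(test)
```

One thing to watch: res_mean is close to zero after detrending (as your own comment notes), so tau can become extremely large or infinite. The sketch does not change that.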

Hope this is useful!

Regards

WW