R vector does not print the expected output

Time: 2016-02-27 23:27:41

Tags: r

corr <- function(directory, threshold = 0){

  #get all the cases that have non-NA values
  complete_cases <- complete(directory)
  #get all the observations over the threshold amount
  filter_cases <- complete_cases[complete_cases[["nobs"]] > threshold, ]

  #The returned data frame contains two columns "ID" and "nobs"

  #get all file names in a vector
  all_files <- list.files(directory, full.names=TRUE)

  correlation <- vector("numeric")

  for(i in as.numeric(filter_cases[["ID"]])){
    #get all the files that are in the filter_cases
    output <- read.csv(all_files[i])
    #remove all NA values from the data
    output <- output[complete.cases(output), ]
    #get each of the correlations and store them
    correlation[i] <- cor(output[["nitrate"]], output[["sulfate"]])
  }

  correlation
}

The result I expect from this is:

corr("directory", 200)

[1] -1.023 0.0456 0.8231 etc

What I get is:

NA NA -1.023 NA NA
NA NA NA 0.0456 NA
0.8231 NA NA NA NA etc

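I feel like I am missing something simple. Calling print(cor(output[["nitrate"]], output[["sulfate"]])) inside the loop basically gives me what I expect. Since I plan to use this function inside other functions, the output must be a vector.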

1 answer:

Answer 0 (score: 1)

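It looks to me like your problem is caused by how the for loop indexes the result. The loop variable i takes the ID values, so correlation[i] skips some entries of the correlation vector, and R sets the skipped entries to NA. It is hard to be sure without access to your data, but the loop header seems intended to let you iterate over only certain files. If that is the case, you are using the loop variable for two purposes, so it may make sense to index correlation with an explicit counter, as shown below.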

#position of the next free slot in the correlation vector
cor_index <- 0
for(i in as.numeric(filter_cases[["ID"]])){
    #get all the files that are in the filter_cases
    output <- read.csv(all_files[i])
    #remove all NA values from the data
    output <- output[complete.cases(output), ]
    #get each of the correlations and store them
    cor_index <- cor_index + 1
    correlation[cor_index] <- cor(output[["nitrate"]], output[["sulfate"]])
}
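
To see where the NA values come from without the original data, here is a minimal sketch. The IDs in ids and the helper fake_cor are made up for illustration; fake_cor only stands in for the read.csv() and cor() steps. It also shows vapply as one possible alternative to the explicit counter, which is a swapped-in technique and not part of the answer above.

```r
# Hypothetical IDs that passed the threshold (assumption, not real data)
ids <- c(3, 9, 11)

# Stand-in for read.csv() + cor(); returns one number per ID (assumption)
fake_cor <- function(i) i / 10

# Indexing the result by ID: R grows the vector to the largest index used
# and fills every position that was never assigned with NA.
by_id <- vector("numeric")
for (i in ids) {
  by_id[i] <- fake_cor(i)
}
print(by_id)   # length 11, NA everywhere except positions 3, 9 and 11

# Mapping over the IDs gives exactly one value per ID, with no gaps.
dense <- vapply(ids, fake_cor, numeric(1))
print(dense)   # length 3

# Dropping NA afterwards also removes the gaps, but it would also drop
# a genuine NA returned by cor(), so it is the less precise fix.
cleaned <- by_id[!is.na(by_id)]
```

In corr() the same idea would mean passing as.numeric(filter_cases[["ID"]]) to vapply and doing the read.csv(), complete.cases() and cor() steps inside the function argument. When no ID passes the threshold, vapply returns numeric(0), the same empty numeric vector the original loop returns in that case.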