R: Function output, wrong number of factors in the output

Time: 2016-05-27 05:08:24

Tags: r output

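I'm a complete R newbie, and I'm trying to learn R through Coursera. I'm trying to write a function that calculates the mean across 332 different csv files. I get the correct values, but the output is wrong: I should get the mean for only one of the factors, but I get the means for both.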

#Assign the directory
pollutantmean <- function (directory, pollutant, id = 1:332) {
  
  directory <- list.files(path= "/Users/......./specdata")

  #Create empty vector 
  g <- list()

  #For loop to run through the files and get info and use rbind to create df
  for(i in 1:length(directory)) {

    g[[i]] <- read.csv(directory[i],header=TRUE)

  }

  rbg <- do.call(rbind,g)

  #Subset to get the sulfate/nitrate columns and calculate the mean
  pollutant <- subset(rbg,ID %in% id ,select = c("sulfate","nitrate"))
  colMeans(pollutant,na.rm = TRUE) 
  
}

pollutantmean("specdata","sulfate",70:72) 

sulfate   nitrate 
0.9501894 1.7060474 

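So far, so good... the values are correct. The problem, however, is that since I pass "sulfate" as the pollutant, I should only get the sulfate mean. Instead, I get both. Why is that? What am I doing wrong here?

Thanks,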

1 answer:

Answer 0 (score: 0)

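Looking closely at the code, you have hard-coded c("sulfate", "nitrate") into the function, so the pollutant argument you pass in is never used to choose the columns. A value that is passed to the function as an argument should not be hard-coded like this. The same applies to directory, which your function overwrites with a fixed path.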
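To make the difference concrete, here is a minimal sketch on made-up data. The data frame `toy` and the helper names `mean_hardcoded` and `mean_by_argument` are hypothetical and only serve the illustration; they are not part of the assignment.

```r
# Toy data standing in for the combined monitor data (values are made up).
toy <- data.frame(
  ID      = c(1, 1, 2),
  sulfate = c(1.0, 3.0, 5.0),
  nitrate = c(2.0, NA, 6.0)
)

# Hypothetical helper: the columns are hard-coded, so `pollutant` is ignored.
mean_hardcoded <- function(data, pollutant, id) {
  rows <- subset(data, ID %in% id, select = c("sulfate", "nitrate"))
  colMeans(rows, na.rm = TRUE)
}

# Hypothetical helper: the column is chosen by the `pollutant` argument.
mean_by_argument <- function(data, pollutant, id) {
  rows <- subset(data, ID %in% id, select = pollutant)
  colMeans(rows, na.rm = TRUE)
}

hard   <- mean_hardcoded(toy, "sulfate", 1:2)    # two means, argument ignored
by_arg <- mean_by_argument(toy, "sulfate", 1:2)  # one mean, sulfate only
```

Both helpers receive "sulfate", but only the second one actually uses it to select the column, which is why the first returns a value for nitrate as well.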

Change your function to the following:

#Compute the mean of the requested pollutant across the selected monitors
pollutantmean <- function (directory, pollutant, id = 1:332) {
  
  #full.names = TRUE keeps the directory in each path so read.csv can find the files
  directory_files <- list.files(path = directory, full.names = TRUE)

  #Create empty list 
  g <- list()

  #For loop to read each file into the list
  for(i in seq_along(directory_files)) {

    g[[i]] <- read.csv(directory_files[i],header=TRUE)

  }

  #Combine all files into one data frame
  rbg <- do.call(rbind,g)

  #Subset to the requested pollutant column and calculate the mean
  pollutant_subset <- subset(rbg,ID %in% id ,select = pollutant)
  colMeans(pollutant_subset,na.rm = TRUE) 
  
}

pollutantmean("/Users/......./specdata","sulfate",70:72) 

sulfate
0.9501894

You should now get the correct result.
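One detail worth checking is how the file paths are built: by default list.files() returns bare file names, so reading them only works when the working directory is the data folder. The sketch below is one possible variant, not the required solution. The name `pollutantmean2`, the temporary folder, and the two tiny csv files are all made up for the illustration. It also returns a plain number via mean() rather than the named vector that colMeans() gives.

```r
# Sketch of a variant; `pollutantmean2` is a hypothetical name.
pollutantmean2 <- function(directory, pollutant, id = 1:332) {
  # full.names = TRUE returns paths that include `directory`,
  # so read.csv works regardless of the current working directory.
  files <- list.files(path = directory, pattern = "\\.csv$", full.names = TRUE)

  # Read every file and stack the rows into one data frame.
  all_data <- do.call(rbind, lapply(files, read.csv))

  # Keep the requested monitors and average the single requested column.
  values <- all_data[all_data$ID %in% id, pollutant]
  mean(values, na.rm = TRUE)
}

# Usage on two tiny made-up files written to a temporary directory.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 3), nitrate = c(2, NA), ID = 1),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = 5, nitrate = 6, ID = 2),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

result <- pollutantmean2(demo_dir, "sulfate", 1:2)
```

Because the column is picked with the `pollutant` argument, calling the same function with "nitrate" returns the nitrate mean without any change to the function body.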