read.csv is modifying the input file name

Date: 2015-10-02 15:33:07

Tags: r csv

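I am trying to pass a directory as input to a function and use it to build the path that read.csv reads a CSV file from. However, somewhere along the way, read.csv seems to modify the file name string that was passed at run time.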

Directory: "C:/SAT/Self Courses/R/data/specdata". This directory contains many CSV files that I want to read and process with the following functions:

complete<-function(directory,id=1:332)
{

  gFull<-c()
  ids<-str_pad(id,3,pad="0")
  idExt<-paste(ids,".csv",sep="")
  dir<-paste(directory,idExt,sep="/")

  for(i in dir)
  {

    tableTemp<- read.csv(i,header=T)
    tableTemp<- na.omit(tableTemp)
    gFull<-c(gFull,nrow(tableTemp))
  }
  output<-data.frame(id,gFull,stringsAsFactors = F)
  return(output)
}  

cor_sub<-function(data,directory)
{
  #print(directory)
  id<-data[1]
  id<-str_pad(id,3,pad="0")
  id<-paste(id,".csv",sep="")
  #print(id)
  dir_temp<-paste(directory,id,sep="/")
  print(dir_temp)
  #read table
  input<-read.csv(dir_temp,header=T)
  input<-na.omit(input)
  #correlation
  return (cor(input$sulfate,input$nitrate))
}


cor<-function(directory,threshold=0)
{
  #find the thresholds of each file
  qorum<-complete(directory,1:12)
  print(threshold)
  qorum$gFull[qorum$gFull<threshold]<-NA
  qorum<-na.omit(qorum)
  v_cor<-apply(qorum,1,cor_sub,directory)
  #(v_cor)

 }

I run this code with the call:

cor("C:/SAT/Self Courses/R/data/specdata",0)

The erroneous output I get is:

> cor("C:/SAT/Self Courses/R/data/specdata",0)
[1] 0
[1] "C:/SAT/Self Courses/R/data/specdata/001.csv"
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file '7.21/001.csv': No such file or directory

The problem is with dir_temp: it holds "C:/SAT/Self Courses/R/data/specdata/001.csv", but on the next line read.csv receives '7.21/001.csv' as its input.

Please bear with me if the problem seems trivial; I am still in newbie mode :)

1 Answer:

Answer 0 (score: 1)

See if this works for you (I have ignored most of the code you have tried so far, since it seems unnecessarily complicated and does not run):

results <- list()
threshold <- 0  # files with this many rows or fewer are skipped
filepath <- "C:/SAT/Self Courses/R/data/specdata"
filenames <- list.files(filepath)  # assumes you want all files in directory
suppressWarnings(
for(filename in filenames) {

    # construct the path for this particular file, and read it
    fpath <- paste(filepath, filename, sep="/")
    input <- read.csv(fpath, header=TRUE)

    # check if threshold is met, skip if not
    if (nrow(input) <= threshold) next

    input <- na.omit(input)  # do you want this before the threshold check?

    # store our correlation in our results list
    # stats::cor() to avoid confusion with your defined function
    results[[filename]] <- stats::cor(input$sulfate, input$nitrate)
})

print(results)
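
A note on the stats::cor() comment in the loop: as far as I can tell, this is also where the '7.21/001.csv' in your error comes from. Your own cor() masks the built-in one, so the call inside cor_sub() most likely re-enters your function with input$sulfate as the "directory". Here is a minimal sketch of that effect; the cor() below is a hypothetical stand-in for yours that only builds a path, and the numbers are made up:

```r
# Hypothetical stand-in for the user-defined cor(): it only builds
# the path it would try to read, instead of reading anything.
cor <- function(directory, threshold = 0) {
  paste(directory, "001.csv", sep = "/")
}

# Made-up values; the column names follow the question.
input <- data.frame(sulfate = c(7.21, 5.99, 4.68),
                    nitrate = c(0.651, 0.428, 1.040))

# The unqualified call resolves to the function defined above,
# so the sulfate values end up being pasted into file paths.
masked <- cor(input$sulfate, input$nitrate)
print(masked[1])

# The namespace-qualified call reaches the real correlation function.
real <- stats::cor(input$sulfate, input$nitrate)
print(real)
```

If that is what is happening, renaming your function or always writing stats::cor() should avoid the clash.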

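Let me know if you have any questions about how the code above works (I haven't actually run it, tbh). You should be able to take it from here and generalize it to your needs.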
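
As one possible way to generalize the loop into a function, here is a rough sketch. It assumes the threshold should count complete rows only (as your complete() does) and that only files ending in .csv matter. The name monitor_cors and the demo files are made up for illustration:

```r
# Sketch: one correlation per CSV file, counting only complete rows
# toward the threshold. `monitor_cors` is a hypothetical name.
monitor_cors <- function(directory, threshold = 0) {
  paths <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  results <- list()
  for (path in paths) {
    # drop incomplete rows before checking the threshold
    input <- na.omit(read.csv(path, header = TRUE))
    # skip files with too few complete rows
    if (nrow(input) <= threshold) next
    results[[basename(path)]] <- stats::cor(input$sulfate, input$nitrate)
  }
  results
}

# Try it on two small made-up files in a temporary directory.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 2, 3, NA), nitrate = c(2, 4, 7, 1)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(1, NA), nitrate = c(NA, 2)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

demo <- monitor_cors(demo_dir, threshold = 2)
print(names(demo))
```

With your data you would call it as monitor_cors("C:/SAT/Self Courses/R/data/specdata", 0).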