Question

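I wrote some code that is supposed to find all of my .txt files (they are output from ODE simulations), open each of them as a data frame using "read.table", and then perform some calculations on them.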

files <- list.files(path="/Users/redheadmammoth/Desktop/Ultimate_Aging_F2016",
                pattern=".txt",full.names=TRUE)
ldf <- lapply(files, read.table)
tuse <- seq(from=0,to=100,by=0.1)

for(files in ldf)
  findR <- function(r){
    with(files,(sum(exp(-r*age)*fecund*surv*0.1)-1)^2)
    }
    {
        R0 <- with(files,(sum(fecund*surv*age)))
        GenTime <- with(files,(sum(tuse*fecund*surv*0.1))/R0)
        r <- optimize(f=findR,seq(-5,5,.0001),tol=0.00000001)$minimum
        RV <- with(files,(exp(r*tuse)/surv)*(exp(-r*tuse)*(fecund*surv))) 

plot(log(surv) ~ age,files,type="l")
tmp.lm <- lm(log(surv) ~ age + I(age^2),files) #Fit log surv to a quadratic
lines(files$age,predict(tmp.lm),col="red")
}

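The problem, however, is that it seems to perform the calculations inside my "for" loop on only one file rather than on all of them. I would like to run the calculations on all of my files and then save everything as one big data frame, so that I can access the results for any particular set of simulations. I suspect the error is that I am not indexing the files correctly to loop over all of them.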

Answer 1

How about using plyr::ldply() for this? It takes a list (in your case, your list of files), applies the same function to each element, and returns a single data frame.

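The main thing to remember is to create a column holding an ID for each file you read in, so that you know which data came from which file. The simplest approach is to use the file name as the ID; you can edit it from there.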

If your function takes other arguments, they go after the function name in the ldply call.

        # create file list ("\\.txt$" matches only names that end in .txt)
        files <- list.files(path="/Users/redheadmammoth/Desktop/Ultimate_Aging_F2016",
                            pattern="\\.txt$",full.names=TRUE)
        tuse <- seq(from=0,to=100,by=0.1)

        load_and_edit <- function(file, tuse){
            # add header=TRUE here if the files start with a row of column names
            temp <- read.table(file)

            # here put all your calculations you want to do on each file
            temp$R0 <- sum(temp$fecund*temp$surv*temp$age)

            # make a column for each file name so you know which data comes from which file
            temp$id <- file

            return(temp)
        }

        # pass the vector of file paths (not the list.files function itself);
        # tuse is forwarded to load_and_edit as its second argument
        new_data <- plyr::ldply(files, load_and_edit, tuse)

This is the simplest way I have found to read in and process multiple files in bulk.
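As for why the original loop only touched one file: `for(files in ldf)` has no braces of its own, so its body is just the `findR <- function(r){...}` assignment. The braced block that follows runs once, after the loop has finished, with `files` still holding the last data frame. Wrapping the per-file work in a function avoids that. Below is a minimal base R sketch of the idea that returns one row of summaries per file. The helper name `summarise_file`, the synthetic files, and the assumption that each file has a header row with `age`, `fecund` and `surv` columns are mine, not taken from the question; the formulas are copied from the question as written.

```r
# Hypothetical helper: one row of summary statistics per simulation file.
# Assumes a header row with columns age, fecund and surv, and that tuse
# has one entry per row of the file.
summarise_file <- function(file, tuse) {
  d <- read.table(file, header = TRUE)

  # Squared deviation of the discounted sum from 1, as in the question
  findR <- function(r) (sum(exp(-r * d$age) * d$fecund * d$surv * 0.1) - 1)^2

  R0 <- sum(d$fecund * d$surv * d$age)  # formula copied from the question
  data.frame(
    id      = basename(file),
    R0      = R0,
    GenTime = sum(tuse * d$fecund * d$surv * 0.1) / R0,
    r       = optimize(findR, interval = c(-5, 5), tol = 1e-8)$minimum
  )
}

# Two small synthetic files stand in for the real ODE output
sim_dir <- tempfile("sims")
dir.create(sim_dir)
age <- seq(0, 10, by = 0.1)
for (m in c(0.1, 0.2)) {
  sim <- data.frame(age = age, fecund = as.numeric(age > 2), surv = exp(-m * age))
  write.table(sim, file.path(sim_dir, paste0("sim_m", m, ".txt")), row.names = FALSE)
}

files <- list.files(sim_dir, pattern = "\\.txt$", full.names = TRUE)

# Apply the helper to every file, then stack the one-row results
results <- do.call(rbind, lapply(files, summarise_file, tuse = age))
print(results)
```

Because every file contributes exactly one row, `results[results$id == "sim_m0.1.txt", ]` pulls out the summaries for a single simulation.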

You can then easily plot each one.
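One way to do the plotting, sketched under the assumption that the combined data frame has `id`, `age` and `surv` columns: split it by `id` and draw one panel per file, reusing the quadratic fit from the question. The data below are synthetic placeholders, not real simulation output.

```r
# Synthetic stand-in for the combined data frame (hypothetical values)
age <- seq(0, 10, by = 0.5)
new_data <- rbind(
  data.frame(id = "sim_a.txt", age = age, surv = exp(-0.05 * age - 0.01 * age^2)),
  data.frame(id = "sim_b.txt", age = age, surv = exp(-0.10 * age - 0.02 * age^2))
)

# One data frame per file, and one quadratic fit of log survival per file
by_file <- split(new_data, new_data$id)
fits <- lapply(by_file, function(d) lm(log(surv) ~ age + I(age^2), data = d))

# One panel per file: observed log survival in black, fitted curve in red
old_par <- par(mfrow = c(1, length(by_file)))
for (id in names(by_file)) {
  d <- by_file[[id]]
  plot(log(surv) ~ age, data = d, type = "l", main = id)
  lines(d$age, predict(fits[[id]]), col = "red")
}
par(old_par)
```

Keeping the fitted models in a named list means you can inspect any one of them later, for example with `coef(fits[["sim_a.txt"]])`.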
