Question

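I want to recursively import files (of different lengths) from subdirectories and put them into a single data.frame, with one column containing the subdirectory name and one column containing the file name (minus the extension):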

e.g. folder structure
IsolatedData
  00
    tap-4.out
    cl_pressure.out
  15
    tap-4.out
    cl_pressure.out

So far I have:

setwd("~/Documents/IsolatedData")
l <- list.files(pattern = ".out$",recursive = TRUE)
p <- bind_rows(lapply(1:length(l), function(i) {chars <- strsplit(l[i], "/");
cbind(data.frame(Pressure = read.table(l[i],header = FALSE,skip=2, nrow =length(readLines(l[i])))),
      Angle = chars[[1]][1], Location = chars[[1]][1])}), .id = "id")

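But I get an error saying that line 43 did not have 2 elements.

I also saw this approach using dplyr, which looks neat, but I couldn't get it to work: http://www.machinegurning.com/rstats/map_df/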

tbl <-
  list.files(recursive=T,pattern=".out$")%>% 
  map_df(~data_frame(x=.x),.id="id")

Answer 1

Here is a workflow that uses the map functions from the purrr package in the tidyverse.

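I generated a bunch of csv files to mimic your file structure, along with some simple data. I threw 2 lines of junk data at the start of each file, since you said you were trying to skip the first two lines.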

The following code creates that file structure:

library(tidyverse)

setwd("~/_R/SO/nested")

walk(paste0("folder", 1:3), dir.create)

list.files() %>%
    walk(function(folderpath) {
        map(1:4, function(i) {
            df <- tibble(
                x1 = sample(letters[1:3], 10, replace = T),
                x2 = rnorm(10)
            )
            dummy <- tibble(
                x1 = c("junk line 1", "junk line 2"),
                x2 = c(0)
            )
            bind_rows(dummy, df) %>%
                write_csv(sprintf("%s/file%s.out", folderpath, i))
        })
    })
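
With files like these in place, the reading step can be sketched roughly as follows. This is a minimal sketch, not the answer's original code: it builds its own small demo tree under tempdir() so that it runs standalone, and the column names folder and file, the regular expressions, and skip = 3 (one header line plus two junk lines) are all assumptions made for illustration.

```r
library(tidyverse)

# Hypothetical demo tree (assumption): 2 folders x 2 files, each file with
# a header line, two junk lines, and two data rows.
root <- file.path(tempdir(), "nested_demo")
for (folder in c("folder1", "folder2")) {
  dir.create(file.path(root, folder), recursive = TRUE, showWarnings = FALSE)
  for (i in 1:2) {
    writeLines(
      c("x1,x2", "junk line 1,0", "junk line 2,0", "a,0.5", "b,-1.2"),
      file.path(root, folder, sprintf("file%s.out", i))
    )
  }
}

# Relative paths such as "folder1/file1.out"
paths <- list.files(root, pattern = "\\.out$", recursive = TRUE)

# map_dfr reads each file and row-binds the results into one tibble
all_data <- map_dfr(paths, function(path) {
  read_csv(file.path(root, path), skip = 3,
           col_names = c("x1", "x2"), col_types = "cd") %>%
    mutate(
      folder = str_extract(path, "^[^/]+"),           # text before the first "/"
      file = str_extract(path, "[^/]+(?=\\.out$)")    # file name without ".out"
    )
})
```

Each row of all_data then carries the folder and file it came from, so files of different lengths can sit in the same table.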

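Because I did this with map_dfr, I get back a single tibble in which the data frames from each iteration have been rbind-ed together. str_extract is one way to pull the folder and file names out of each path.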

Created on 2018-04-21 by the reprex package (v0.2.0).

Answer 2

You could try:

#include <stdio.h>
#include <math.h>

int main()
{
    double radius;     // radius of the circle
    double side_a;     // side a of the rectangle, entered by the user
    double num_pol;    // number of sides of the polygon
    double side_b;     // side b of the rectangle, to be calculated
    double S_circ;     // area of the circle
    double S_rect;     // area of the rectangle
    double S_pol;      // area of the polygon
    double P=3.1416;   // the number pi
    double diameter;   // diameter of the circle
    do
    {
        printf_s("enter a value for a:");
        scanf_s("%lf", &side_a);
    } while (side_a < 0);
    do
    {
        printf_s("enter a value for r:");
        scanf_s("%lf", &radius);
        diameter=radius*2;
    } while (radius < 0 && diameter < side_a);
}

Answer 3

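I'm guessing that your ".out" files consist of a single column of data? If so, you can use scan instead of read.table. I'm also guessing that you want the folder name in a column called Angle, the file name (minus the extension) in a column called Location, and the data in a column called Pressure. If that is correct, the following should work: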

setwd("~/Documents/IsolatedData")
# Relative paths such as "00/tap-4.out"
l <- list.files(pattern = "\\.out$", recursive = TRUE)
p <- data.frame()
for (i in seq_along(l)){
  # Angle: first path component; Location: file name without ".out"
  pt <- data.frame(Angle = strsplit(l[i], "/")[[1]][1],
                   Location = sub("\\.out$", "", basename(l[i])),
                   Pressure = scan(l[i], skip=2))
  p <- rbind(p, pt)
}

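I know it's unfashionable to give an answer that uses only base R, especially one involving a loop. However, for something like iterating over the files in a directory, IMHO it's a perfectly reasonable thing to do, especially for readability and ease of debugging. Of course, as you'd expect, if you are dealing with big data then growing an object with rbind in a loop (or apply, for that matter) is not a good idea, but I suspect that is not the case here.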
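
If the data did get large, one common base R alternative is to read each file into its own data frame, collect those in a list, and bind them once at the end. The sketch below assumes single-column files with two lines to skip; it creates a small hypothetical demo tree under tempdir() (the file contents are made up) so that it can run on its own.

```r
# Hypothetical demo tree (assumption): 2 folders x 2 files, each with
# two header lines followed by three numeric values.
root <- file.path(tempdir(), "IsolatedData_demo")
for (angle in c("00", "15")) {
  dir.create(file.path(root, angle), recursive = TRUE, showWarnings = FALSE)
  for (loc in c("tap-4", "cl_pressure")) {
    writeLines(c("header 1", "header 2", "1.5", "2.5", "3.5"),
               file.path(root, angle, paste0(loc, ".out")))
  }
}

l <- list.files(root, pattern = "\\.out$", recursive = TRUE)

# Read every file into its own data frame first ...
pieces <- lapply(l, function(f) {
  data.frame(Angle = dirname(f),
             Location = sub("\\.out$", "", basename(f)),
             Pressure = scan(file.path(root, f), skip = 2, quiet = TRUE))
})

# ... then bind them in a single call instead of growing p inside the loop
p <- do.call(rbind, pieces)
```

The result has the same three columns as the loop version; only the binding strategy differs.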

How to import files from subdirectories and name them with the subdirectory name in R
