Question

I have multiple .csv files in a folder, each with a different suffix. For example:

Data_Software
Data_Hardware
Data_Manufacturing ....

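and many other similar .csv files. I want to create a new column in each dataset, say "type", that contains the suffix of the corresponding file; i.e., every observation in the type column of Data_Software should say Software, and in Data_Hardware it should say Hardware.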

Can someone help?

Answer 1

Try this. Apologies, it assumes they are data.frames in your environment; if that is not the case, feel free to ignore it or suggest changes:

# Data frames in your environment
Data_Tom <- iris
Data_Dick <- iris
Data_Harry <- iris

# Get the names of the objects
objs <- ls(pattern = "^Data_")  # anchored so only names starting with Data_ match

# Add the suffix as a new column
objs <- lapply(objs, 
               function(x){
                 type <- gsub("Data_", "", x)
                 df <- get(x)
                 cbind(df, Type = type)
               })

# Combine them together, you might not need this
combine <- do.call(rbind, objs)
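
If you would rather keep each data frame separate and addressable by name, a minimal sketch along the same lines is below. It assumes the data frames already exist in your environment; `Data_Software` and `Data_Hardware` here are small stand-ins built from `iris`, not your real data.

```r
# Hypothetical stand-ins for the real data frames
Data_Software <- head(iris, 3)
Data_Hardware <- head(iris, 2)

# Collect every object whose name starts with "Data_" into a named list
dfs <- mget(ls(pattern = "^Data_"))

# Add the suffix of each object's name as a Type column
dfs <- Map(function(df, nm) {
  df$Type <- sub("^Data_", "", nm)
  df
}, dfs, names(dfs))
```

Each element keeps its original name, so `dfs$Data_Software` is the Software table with its new Type column.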

Answer 2

Although I wouldn't recommend it, I would probably do it like this:

library(data.table) # needed for fread and :=

# Get a list of all files in the directory 
my_dir <- "my_path_here"    
# pattern is a regular expression, not a glob, so escape the dot
FILES <- list.files(path = my_dir, pattern = "\\.csv$", full.names = TRUE, recursive = FALSE)

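# Read every file, add a Type column, and assign the table to the global environment
invisible(lapply(FILES, function(x) {
  # Strip the path, the Data_ prefix, and the .csv extension
  suffix <- gsub("^Data_|\\.csv$", "", basename(x))
  assign(suffix, fread(x, header = TRUE)[, Type := suffix], envir = .GlobalEnv)
}))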

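This creates one table per csv. Each table is named after its file, with the extension, the path, and Data_ stripped. It also creates a column containing the table name as the file is read.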
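
If assigning into the global environment is the part you want to avoid, here is a rough base R sketch that keeps the tables in a named list instead. The temporary folder and the two small files are made up so the example runs on its own; point `my_dir` at your real folder instead.

```r
# Hypothetical setup: write two small CSV files into a temporary folder
my_dir <- file.path(tempdir(), "csv_demo")
dir.create(my_dir, showWarnings = FALSE)
write.csv(head(iris, 3), file.path(my_dir, "Data_Software.csv"), row.names = FALSE)
write.csv(head(iris, 2), file.path(my_dir, "Data_Hardware.csv"), row.names = FALSE)

# List the CSV files; pattern is a regular expression, not a glob
files <- list.files(my_dir, pattern = "^Data_.*\\.csv$", full.names = TRUE)

# Derive the suffix from each file name: drop the path, ".csv", and "Data_"
suffixes <- sub("^Data_", "", tools::file_path_sans_ext(basename(files)))

# Read each file and add the Type column
tables <- Map(function(f, s) {
  df <- read.csv(f)
  df$Type <- s
  df
}, files, suffixes)

# Name the list elements by suffix rather than by full file path
names(tables) <- suffixes
```

You can then use `tables$Software` directly, or stack everything with `do.call(rbind, tables)` if the files share the same columns.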

Dynamic column creation based on the dataframe suffix
