Question

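I have many files in a folder, and for each file I want to delete several columns and insert a new one. I can do this for one file at a time with the following code: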

df <- read.csv("C:\\Users\\name\\Documents\\CSV Files\\1\\30335\\file1.csv")
df <- df[-c(6:34)]
df$newcolumn <- df$column1-df$column2
write.table(df, file = "C:\\Users\\name\\Documents\\CSV Files\\1\\30335\\file1.csv",
sep = ",", dec = ".", col.names = T, row.names = F)

However, I would like to do this for all the files in the folder in a single run.

Thanks in advance for your help.

Answer 1

First, some dummy data to work with:

for (i in seq_len(3)) {
  df <- data.frame(A = runif(10), B = runif(10), C = runif(10))
  fname <- paste0("./df", i, ".csv")
  write.csv(df, fname, row.names = FALSE)
}

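OK, first list the .csv files in the directory (this assumes the working directory is the home directory, since the dummy files above were written to ./):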

path <- "~/"
fs <- list.files(path, pattern = glob2rx("*.csv"))

which gives

R> fs
[1] "df1.csv" "df2.csv" "df3.csv"

Next, loop over the set of files:

for (f in fs) {
  fname <- file.path(path, f)             ## current file name
  df <- read.csv(fname)                   ## read file
  df <- df[, -2]                          ## delete column B
  df$D <- df[, 1] + df[, 2]               ## add something
  write.csv(df, fname, row.names = FALSE) ## write it out
}

That's it, but check that it worked:

R> read.csv(file.path(path, fs[1]))
         A        C      D
1  0.71253 0.405461 1.1180
2  0.83507 0.353672 1.1887
3  0.61541 0.018851 0.6343
4  0.92108 0.006301 0.9274
5  0.07466 0.570673 0.6453
6  0.81803 0.160932 0.9790
7  0.50841 0.935930 1.4443
8  0.64912 0.965246 1.6144
9  0.31503 0.946411 1.2614
10 0.41563 0.212671 0.6283

The complete script is:

path <- "~/"
fs <- list.files(path, pattern = glob2rx("*.csv"))
for (f in fs) {
  fname <- file.path(path, f)             ## current file name
  df <- read.csv(fname)                   ## read file
  df <- df[, -2]                          ## delete column B
  df$D <- df[, 1] + df[, 2]               ## add something
  write.csv(df, fname, row.names = FALSE) ## write it out
}

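The glob2rx() call converts the file-pattern glob into a regular expression, so that only files with the .csv extension are selected. If you know regular expressions you can write the pattern yourself, but glob2rx() is a nice shortcut for those who don't speak regex.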
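To make the conversion concrete, here is a small check of what glob2rx() matches. The file names and the hand-written pattern "\\.csv$" are my own illustration, not part of the answer above; treat this as a sketch.

```r
## glob2rx() turns a shell-style glob into an anchored regular expression
rx <- glob2rx("*.csv")
print(rx)

## hypothetical file names, used only to compare the two patterns
files <- c("df1.csv", "notes.txt", "df2.csv", "data.csv.bak")

by_glob  <- grepl(rx, files)         ## pattern produced by glob2rx()
by_regex <- grepl("\\.csv$", files)  ## hand-written pattern (an assumption)

print(files[by_glob])
print(identical(by_glob, by_regex))
```

Both patterns pick only the names that end in .csv, so a file such as data.csv.bak is skipped.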

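Basically, the solution above and Sven's answer are very similar. I prefer the loop approach here because creating an anonymous function, while not at all difficult, is one step removed from the actual problem, which is to perform a series of steps one after another. In my opinion that is shown most clearly by a loop. But this is purely personal preference.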
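If you want something between the two styles, one option is to put the per-file steps in a named function and then call it from either a loop or lapply(). The sketch below assumes a helper I have called process_file (a hypothetical name) and works in a temporary directory so that no real files are touched.

```r
## hypothetical helper: the per-file steps from the loop, wrapped in a named function
process_file <- function(fname) {
  df <- read.csv(fname)                   ## read file
  df <- df[, -2]                          ## delete column B
  df$D <- df[, 1] + df[, 2]               ## add something
  write.csv(df, fname, row.names = FALSE) ## write it out
  invisible(fname)
}

## dummy data in a temporary directory (assumed location, for illustration only)
path <- file.path(tempdir(), "loop_vs_lapply")
dir.create(path, showWarnings = FALSE)
for (i in seq_len(3)) {
  df <- data.frame(A = runif(10), B = runif(10), C = runif(10))
  write.csv(df, file.path(path, paste0("df", i, ".csv")), row.names = FALSE)
}

## full.names = TRUE returns complete paths, so file.path() is not needed later
fs <- list.files(path, pattern = glob2rx("*.csv"), full.names = TRUE)

for (f in fs) process_file(f)            ## loop version
## invisible(lapply(fs, process_file))   ## lapply version; run one or the other, not both
```

Running both versions in the same session would modify every file twice, which is why the lapply() line is commented out.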

For your specific example (untested, as I don't have your setup) you would need:

path <- "C:\\Users\\name\\Documents\\CSV Files\\1\\30335"
fs <- list.files(path, pattern = glob2rx("*.csv"))
for (f in fs) {
  fname <- file.path(path, f)                         ## current file name
  df <- read.csv(fname)                               ## read file
  df <- df[, -(6:34)]                                 ## delete columns
  df$newcolumn <- df[, "column1"] - df[, "column2"]   ## add new column, as in the question
  write.csv(df, fname, row.names = FALSE)             ## write it out
}

Answer 2

Here is one way to do it:

path <- "C:\\Users\\name\\Documents\\CSV Files\\1\\30335"

files <- list.files(path = path)

lapply(files, function(file) {
  fp <- file.path(path, file)
  df <- read.csv(fp)[-(6:34)]  ## parentheses needed: -6:34 would mix negative and positive indices
  df$newcolumn <- df$column1 - df$column2
  write.table(df, file = fp,
              sep = ",", dec = ".", col.names = TRUE, row.names = FALSE)
})
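
Note that list.files(path = path) returns every file in the folder, not only CSV files. If the folder may contain anything else, it can help to filter by extension and ask for full paths. The following is a minimal sketch under that assumption; the folder, the file names and the three-column layout are made up for illustration, and the position dropped would be -(6:34) for the data in the question.

```r
## dummy folder with one CSV and one non-CSV file (hypothetical names)
path <- file.path(tempdir(), "csv_only")
dir.create(path, showWarnings = FALSE)
write.csv(data.frame(column1 = 1:3, column2 = 3:1, extra = 0),
          file.path(path, "a.csv"), row.names = FALSE)
writeLines("not a table", file.path(path, "readme.txt"))

## keep only names ending in .csv and return complete paths
files <- list.files(path, pattern = "\\.csv$", full.names = TRUE)

## invisible() suppresses the list of NULLs that lapply() would otherwise print
invisible(lapply(files, function(fp) {
  df <- read.csv(fp)[-3]                   ## drop the third column by position
  df$newcolumn <- df$column1 - df$column2  ## add the new column
  write.csv(df, fp, row.names = FALSE)     ## overwrite the original file
}))
```

Here readme.txt is never read, so read.csv() is only ever handed files it can parse.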
