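I have a series of CSV files that I want to prepare for appending together. The appended file will be large, so I would like to convert some string variables to numeric and date formats in the individual files rather than in the larger appended file.
In other software, I would use a for loop to open the files and nested for loops to iterate over certain groups of variables. For this project, I am trying to use R and the apply family of functions.
I have mapply and lapply calls that each work on their own. Now I am trying to figure out how to combine them. Can I nest them? (See the standalone pieces and the nested version below.)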
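(This code draws on the code in an answer to "How do I update data frame variables with sapply results?")
(Is it customary to supply a sample CSV to make an example reproducible? Does R ship with any built-in sample CSVs?)
These work individually: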
insert.division <- function(fileroot, divisionname){
  ext <- ".csv"
  file <- paste(fileroot, ext, sep = "")
  data <- read.csv(file, header = TRUE, stringsAsFactors = FALSE)
  data$division <- divisionname
  write.csv(data, file = paste(fileroot, "_adj3", ext, sep = ""),
            row.names = FALSE)
}
files <- c(
  "file1",
  "file2",
  "file3",
  "file4",
  "file5"
)
divisions <- c(1:5)
#Open the files, insert division name, save new versions
mapply(insert.division, fileroot = files, divisionname = divisions)
#Change currency variables from string to numeric
currency.vars <- c(
  "Price",
  "RetailPrice"
)
df[currency.vars] <- lapply(
  df[currency.vars],
  function(x) as.numeric(sub("^\\(","-", gsub("[$,]|\\)$","", x)))
)
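To see what the conversion above does, here is a small sketch on made-up values (the vector `x` and the names `cleaned` and `values` are assumptions for illustration, not data from the real files). The inner `gsub()` drops `$`, `,` and a trailing `)`, and the outer `sub()` turns a leading `(` into a minus sign, so accounting-style negatives survive:

```r
# Made-up currency strings, including an accounting-style negative
x <- c("$1,234.50", "($12.00)", "7")

# Inner gsub(): remove "$", "," and a trailing ")"
# Outer sub():  replace a leading "(" with "-"
cleaned <- sub("^\\(", "-", gsub("[$,]|\\)$", "", x))

# Convert the cleaned strings to numbers
values <- as.numeric(cleaned)
```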
Combined version:
file.prep <- function(fileroot, divisionname, currency.vars){
  ext <- ".csv"
  file <- paste(fileroot, ext, sep = "")
  data <- read.csv(file, header = TRUE, stringsAsFactors = FALSE)
  data$division <- divisionname
  df[currency.vars] <- lapply(
    df[currency.vars],
    function(x) as.numeric(sub("^\\(","-", gsub("[$,]|\\)$","", x)))
  )
  write.csv(data, file = paste(fileroot, "_adj", ext, sep = ""),
            row.names = FALSE)
}
#Open the files, insert division name, change the currency variables,
#save new versions
mapply(file.prep, fileroot = files, divisionname = divisions,
       currency.vars = df[currency.vars])
Answer 0 (score: 1)
I'm not sure why you write the data back to a file after changing it, but here is an example of how I would approach the problem.
## Set up three csv files
set.seed(1)
DF <- data.frame(
  w = paste0("($", sample(1500, 30) / 100, ")"),
  x = Sys.Date() + 0:29,
  y = sample(letters, 30, TRUE),
  z = paste0("($", sample(1500, 30) / 100, ")")
)
fnames <- paste0("file", 1:3, ".csv")
Map(write.csv, split(DF, c(1, 10, 20)), fnames, row.names = FALSE)
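Note that `split()` recycles the grouping vector `c(1, 10, 20)` down the rows, so the three pieces are interleaved (rows 1, 4, 7, ... go to the first file) rather than contiguous blocks of ten. A minimal sketch on a made-up six-row data frame, where `demo`, `parts`, `sizes` and `first_ids` are illustrative names only:

```r
# Made-up six-row data frame
demo <- data.frame(id = 1:6)

# The length-3 grouping vector is recycled: 1, 10, 20, 1, 10, 20
parts <- split(demo, c(1, 10, 20))

# Number of rows that ended up in each piece
sizes <- sapply(parts, nrow)

# Row ids assigned to the group labelled "1"
first_ids <- parts[["1"]]$id
```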
You can tweak your file.prep() function slightly:
file.prep <- function(fileroot, divname, vars) {
  ext <- ".csv"
  file <- paste0(fileroot, ext)
  data <- read.csv(file, stringsAsFactors = FALSE)
  data$division <- divname
  ## Strip "(", ")" and "$", then let type.convert() pick the type;
  ## as.is = TRUE keeps anything non-numeric as character, not factor
  data[vars] <- lapply(data[vars], function(x) {
    type.convert(gsub("[()$]", "", x), as.is = TRUE)
  })
  write.csv(data, row.names = FALSE, file = paste0(fileroot, "_adj", ext))
}
divname <- 1:3
fnames <- paste0("file", divname)
Map(file.prep, fnames, divname, MoreArgs = list(vars = c("w", "z")))
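If the end goal is a single appended data set, the intermediate `_adj` files may not be needed. The following is a minimal sketch under stated assumptions, not the original poster's method: `parse_currency()` and `prep_one()` are hypothetical helper names, the sample files are made up and written to a temporary directory, and the currency parser reuses the regex from the question so that values in parentheses become negative (stripping the parentheses with `gsub("[()$]", "", x)` would leave them positive).

```r
# Hypothetical helper: "($1,234.50)" becomes -1234.5, "$12.00" becomes 12
parse_currency <- function(x) {
  as.numeric(sub("^\\(", "-", gsub("[$,]|\\)$", "", x)))
}

# Hypothetical helper: read one file, tag it with its division, convert the
# currency columns, and return the data frame instead of writing it to disk
prep_one <- function(path, divname, vars) {
  data <- read.csv(path, stringsAsFactors = FALSE)
  data$division <- divname
  data[vars] <- lapply(data[vars], parse_currency)
  data
}

# Made-up sample files in a temporary directory
paths <- file.path(tempdir(), paste0("sketch", 1:2, ".csv"))
write.csv(data.frame(w = c("($1.50)", "$2.00"),
                     z = c("$3.25", "($1,000.00)")),
          paths[1], row.names = FALSE)
write.csv(data.frame(w = "$4.00", z = "($0.75)"),
          paths[2], row.names = FALSE)

# Map() walks the paths and division ids in parallel;
# MoreArgs is handed unchanged to every call
pieces <- Map(prep_one, paths, 1:2, MoreArgs = list(vars = c("w", "z")))

# Append the converted pieces into one data frame
combined <- do.call(rbind, unname(pieces))
```

Returning the data frame from the helper keeps the per-file conversion and the final append in one pass; a `write.csv()` call can still be added inside `prep_one()` if the adjusted files are wanted on disk.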