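I have several txt files. Each file contains comma-separated columns of data, and each has its own file name.
So far I have combined these files into one big data frame with the following code: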
files = list.files()
data2=lapply(files, read.table, header=FALSE, sep=",")
data_rbind <- do.call("rbind", data2)
colnames(data_rbind)[c(1,2,3)]<-c("name", "sex", "amount")
This returns:
name sex amount
Anna F 24567
Emma F 23210
Isabelle F 31212
Amanda F 22631
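I would like to add a fourth column that shows, next to each row, the name of the file that row originally came from.
So, for example, if the first file, 'example1.txt', contains the following:
Anna,F,24567
Emma,F,23210
Isabelle,F,31212
and the second file, 'example2.txt', contains the following:
Amanda,F,22631
Sara,F,41355
Katie,F,2387
I would like to get the following:
name sex amount year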
Anna F 24567 example1.txt
Emma F 23210 example1.txt
Amanda F 22631 example2.txt
Sara F 41355 example2.txt
Katie F 2387 example2.txt
Is this possible?
Answer 0 (score: 3)
Try:
files = list.files()
data2=lapply(files, read.table, header=FALSE, sep=",")
for (i in seq_along(data2)) {data2[[i]] <- cbind(data2[[i]], files[i])}
data_rbind <- do.call("rbind", data2)
colnames(data_rbind)[c(1,2,3,4)]<-c("name", "sex", "amount","year")
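To check this against the sample data from the question, here is a minimal self-contained sketch. It assumes the two example files are recreated in a scratch folder under `tempdir()` (the folder name `rbind_demo` and the variable `demo_dir` are made up for illustration). It also passes `colClasses` explicitly, because `read.table` may otherwise read a column that contains only `F` as logical `FALSE`.

```r
# Assumption: recreate the question's two sample files in a scratch folder.
demo_dir <- file.path(tempdir(), "rbind_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Anna,F,24567", "Emma,F,23210", "Isabelle,F,31212"),
           file.path(demo_dir, "example1.txt"))
writeLines(c("Amanda,F,22631", "Sara,F,41355", "Katie,F,2387"),
           file.path(demo_dir, "example2.txt"))

files <- list.files(demo_dir, pattern = "\\.txt$")

# Read every file; colClasses keeps the sex column as text instead of logical.
data2 <- lapply(file.path(demo_dir, files), read.table,
                header = FALSE, sep = ",",
                colClasses = c("character", "character", "integer"))

# Append the source file name to every row of each data frame.
for (i in seq_along(data2)) {
  data2[[i]] <- cbind(data2[[i]], files[i])
}

data_rbind <- do.call("rbind", data2)
colnames(data_rbind) <- c("name", "sex", "amount", "year")
print(data_rbind)
```

Each row should end up tagged with `example1.txt` or `example2.txt`, depending on which file it was read from.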
Answer 1 (score: 2)
You could also use:
nm1 <- c("Name", "Sex", "Amount", "Year")
files <- list.files(pattern="^example")
files
#[1] "example1.txt" "example2.txt"
setNames(do.call(rbind,Map(`cbind`,
lapply(files, read.table, sep=","), V4=files)), nm1)
# Name Sex Amount Year
#1 Anna F 24567 example1.txt
#2 Emma F 23210 example1.txt
#3 Isabelle F 31212 example1.txt
#4 Amanda F 22631 example2.txt
#5 Sara F 41355 example2.txt
#6 Katie F 2387 example2.txt
Or use rbindlist from data.table:
library(data.table)
setnames(rbindlist(Map(`cbind`,lapply(files, fread),files)),nm1)[]
# Name Sex Amount Year
#1: Anna F 24567 example1.txt
#2: Emma F 23210 example1.txt
#3: Isabelle F 31212 example1.txt
#4: Amanda F 22631 example2.txt
#5: Sara F 41355 example2.txt
#6: Katie F 2387 example2.txt
Answer 2 (score: 0)
You could try something like this:
data2 = lapply(files, function(x) {
  # read one file, then record which file its rows came from
  res <- read.table(x, header=FALSE, sep=",")
  res$year <- x
  res
})
data_rbind <- do.call("rbind", data2)
colnames(data_rbind) <- c("name", "sex", "amount", "year")
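If the files are not in the working directory, the same idea can be sketched with `list.files(full.names = TRUE)` for reading and `basename()` for the stored name. The folder `names_data` and the helper `read_with_source` below are hypothetical names used only for this example, and the sample files are cut down so the sketch runs on its own.

```r
# Hypothetical folder holding the input files; created here so the sketch is runnable.
data_dir <- file.path(tempdir(), "names_data")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("Anna,F,24567", "Emma,F,23210"), file.path(data_dir, "example1.txt"))
writeLines("Amanda,F,22631", file.path(data_dir, "example2.txt"))

# full.names = TRUE returns paths that read.table can open from any working directory.
paths <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file and tag its rows with the file name.
read_with_source <- function(path) {
  res <- read.table(path, header = FALSE, sep = ",",
                    col.names = c("name", "sex", "amount"),
                    colClasses = c("character", "character", "integer"))
  res$year <- basename(path)  # keep only the file name, not the whole path
  res
}

data_rbind <- do.call("rbind", lapply(paths, read_with_source))
print(data_rbind)
```

Naming the columns inside the helper means every piece already has matching names before `rbind` is called, so no renaming step is needed afterwards.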