Merging matching rows across multiple files

Time: 2015-10-20 11:50:16

Tags: r

I have 3 files. File 1 (F1.txt):

"se" "su"
"1" "<{FBB rr e2}>" 0.227701005025126
"2" "<{FB EX PR}>" 0.286903266331658
"3" "<{FB}>" 0.121309673366834

File 2 (F2.txt):

"se" "su"
"1" "<{FBB rr e2}>" 0.28881
"2" "<{FB EX PR}>" 0.273897
"3" "<{FB}>" 0.19998

File 3 (F3.txt):

"se" "su"
"1" "<{FB EX PR}>" 0.256758
"2" "<{FBB rr e2}>" 0.299991
"3" "<{FB423}>" 0.17890

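I want to create one long file in which each group starts with a line saying which files a given "su" value appears in, followed by the matching rows from those files:

Output file (OT.txt):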

    "se" "su"
    <{FBB rr e2}> in files: 1, 2, 3
    file1: "1" "<{FBB rr e2}>" 0.227701005025126
    file2: "1" "<{FBB rr e2}>" 0.28881
    file3: "2" "<{FBB rr e2}>" 0.299991
    <{FB EX PR}> in files: 1, 2, 3
    file1: "2" "<{FB EX PR}>" 0.286903266331658
    file2: "2" "<{FB EX PR}>" 0.273897
    file3: "1" "<{FB EX PR}>" 0.256758
    <{FB}> in files: 1, 2
    file1: "3" "<{FB}>" 0.121309673366834
    file2: "3" "<{FB}>" 0.19998
    <{FB423}> in files: 3
    file3: "3" "<{FB423}>" 0.17890

1 Answer:

Answer 0: (score: 2)

Does this help?

# Combine the data. The header line has one field fewer than the data rows,
# so read.table() uses the first column as row names: `se` holds the
# "<{...}>" string and `su` the numeric value.
F1 <- read.table(text='"se" "su"
"1" "<{FBB rr e2}>" 0.227701005025126
"2" "<{FB EX PR}>" 0.286903266331658
"3" "<{FB}>" 0.121309673366834', header=TRUE)
F1$file <- 1
F2 <- read.table(text='"se" "su"
"1" "<{FBB rr e2}>" 0.28881
"2" "<{FB EX PR}>" 0.273897
"3" "<{FB}>" 0.19998', header=TRUE)
F2$file <- 2
F3 <- read.table(text='"se" "su"
"1" "<{FB EX PR}>" 0.256758
"2" "<{FBB rr e2}>" 0.299991
"3" "<{FB423}>" 0.17890', header=TRUE)
F3$file <- 3


# Make one big data frame

FF <- do.call(rbind, list(F1, F2, F3))
str(FF)

# Sort by se, then by su

res <- FF[with(FF, order(se, su)), ]
res

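# Or write to a file (very hacky)
outputfile <- "OT.txt"
# Start from an empty file, because the calls below append to it
unlink(outputfile)
invisible(lapply(split(FF, FF$se), function(x) {
  # Group header: the value and the files it occurs in
  header <- sprintf("%s in files %s\n", as.character(x$se[1]), paste(sort(unique(x$file)), collapse=", "))
  cat(header, file=outputfile, append=TRUE)
  # Rows for this value, ordered by su (row names are written as well)
  write.table(x[order(x$su), c("se","su")], file=outputfile, append=TRUE, col.names=FALSE)
}))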
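
The file-writing step above lists the files in each group header but does not prefix each row with its source file. Here is a minimal sketch that gets closer to the layout asked for in the question. It makes a few assumptions: the inputs are given inline (with real files, passing the file name to read.table() in place of `text =` should work the same way), every column is read as text so values such as 0.17890 keep their trailing zero, and the names `inputs`, `id` and `format_group` are made up for illustration.

```r
# Sample data from the question, one string per input file.
inputs <- list(
  '"se" "su"
"1" "<{FBB rr e2}>" 0.227701005025126
"2" "<{FB EX PR}>" 0.286903266331658
"3" "<{FB}>" 0.121309673366834',
  '"se" "su"
"1" "<{FBB rr e2}>" 0.28881
"2" "<{FB EX PR}>" 0.273897
"3" "<{FB}>" 0.19998',
  '"se" "su"
"1" "<{FB EX PR}>" 0.256758
"2" "<{FBB rr e2}>" 0.299991
"3" "<{FB423}>" 0.17890'
)

# Read each input, skipping its header line and naming the three columns
# explicitly; `id` is a hypothetical name for the leading row label.
tables <- lapply(seq_along(inputs), function(i) {
  d <- read.table(text = inputs[[i]], skip = 1, header = FALSE,
                  col.names = c("id", "se", "su"),
                  colClasses = "character")
  d$file <- i  # remember which input the rows came from
  d
})
FF <- do.call(rbind, tables)

# Turn one group (all rows sharing the same `se` value) into output lines:
# a header naming the files, then one "fileN: ..." line per row.
format_group <- function(x) {
  x <- x[order(x$file), ]
  header <- sprintf("%s in files: %s", x$se[1],
                    paste(sort(unique(x$file)), collapse = ", "))
  rows <- sprintf('file%d: "%s" "%s" %s', x$file, x$id, x$se, x$su)
  c(header, rows)
}

# Group in order of first appearance rather than alphabetically.
groups <- split(FF, factor(FF$se, levels = unique(FF$se)))
out_lines <- c('"se" "su"',
               unlist(lapply(groups, format_group), use.names = FALSE))
writeLines(out_lines, "OT.txt")
```

Building all the lines first and writing them with a single writeLines() call avoids the repeated append step, so rerunning the script overwrites OT.txt instead of growing it.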