Combining columns from multiple files using awk

Asked: 2015-09-23 21:31:57

Tags: awk

I have a set of files in a folder, like this:

hsa-miR-106a-5p.filtered.txt   hsa-miR-182-5p.filtered.txt   hsa-miR-2467-5p.filtered.txt   hsa-miR-421.filtered.txt      hsa-miR-592.filtered.txt
hsa-miR-106b-3p.filtered.txt   hsa-miR-183-3p.filtered.txt   hsa-miR-25-3p.filtered.txt     hsa-miR-424-3p.filtered.txt   hsa-miR-615-3p.filtered.txt
hsa-miR-106b-5p.filtered.txt   hsa-miR-183-5p.filtered.txt   hsa-miR-25-5p.filtered.txt     hsa-miR-424-5p.filtered.txt   hsa-miR-625-3p.filtered.txt
hsa-miR-1180-3p.filtered.txt   hsa-miR-188-5p.filtered.txt   hsa-miR-27a-3p.filtered.txt    hsa-miR-431-5p.filtered.txt   hsa-miR-625-5p.filtered.txt
hsa-miR-1246.filtered.txt      hsa-miR-18a-3p.filtered.txt   hsa-miR-27a-5p.filtered.txt

Each file looks like this:

ENSG00000224531.4       SMIM13  ENST00000416247.2       9606    hsa-miR-135b-5p 3       132     139     -0.701  99      -0.701  99
ENSG00000112357.8       PEX7    ENST00000541292.1       9606    hsa-miR-135b-5p 3       428     435     -0.683  99      -0.640  99
ENSG00000138279.11      ANXA7   ENST00000372921.5       9606    hsa-miR-135b-5p 3       205     212     -0.631  99      -0.631  99
ENSG00000135248.11      FAM71F1 ENST00000315184.5       9606    hsa-miR-135b-5p 3       488     495     -0.581  99      -0.581  99
ENSG00000087302.4       C14orf166       ENST00000556760.1       9606    hsa-miR-135b-5p 3       34      41      -0.566  99      -0.566  99
ENSG00000104722.9       NEFM    ENST00000433454.2       9606    hsa-miR-135b-5p 3       25      32      -0.565  99      -0.565  99
ENSG00000132485.8       ZRANB2  ENST00000254821.6       9606    hsa-miR-135b-5p 3       284     291     -0.566  99      -0.565  99
ENSG00000185127.5       C6orf120        ENST00000332290.2       9606    hsa-miR-135b-5p 3       125     132     -0.564  99      -0.553  99

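I want to merge the rows of all the files so that I end up with a single output file containing every row. I have been experimenting with R, but I decided to try awk: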

for f in *.filtered.txt
do
 ....
done

1 answer:

Answer 0 (score: 1)

Maybe this can help you (an R solution):

files <- dir(pattern="\\.txt$") # Get the file list and put it in a vector
df <- NULL # Initialize the variable that will hold the data frame
for(f in files){
  if(is.null(df)) {                # If the data frame is still empty ...
    df <- read.table(f)            # ... read the first file in the list
  } else {                         # ... otherwise ...
    df <- rbind(df, read.table(f)) # ... append the rows
                                   # from the next file
  }
}
# Note: is.null() is used instead of is.na() because is.na() on a data frame
# returns a matrix, which is not a valid condition for if().
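
Since the question asked about awk, here is a minimal sketch of the same row-wise merge in the shell. It is only one possible approach, not necessarily what the asker ended up using. The function name merge_rows and the output name combined.tsv are hypothetical, chosen for illustration. awk reads every file named on its command line in turn, so no explicit for loop is needed; the pattern NF skips blank lines and prints every other line unchanged.

```shell
# merge_rows OUTFILE FILE...
# Hypothetical helper: writes the non-blank rows of every FILE, in the
# order given, to OUTFILE.
merge_rows() {
    out=$1
    shift
    # NF is the number of fields on the current line. It is 0 (false) for
    # blank lines, so only lines with content trigger the default action,
    # which prints the line as-is.
    awk 'NF' "$@" > "$out"
}

# Example call (assumes the *.filtered.txt files are in the current directory):
#   merge_rows combined.tsv *.filtered.txt
```

Give the output file a name that does not match the input glob (here it ends in .tsv rather than .filtered.txt), otherwise a second run would read the previous output back in as input. Unlike a plain cat, this also terminates the last line of a file that lacks a trailing newline, so rows from adjacent files are not glued together.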