How can I move all strings in one file that match the lines of another file into columns of an output file?

Asked: 2015-09-30 16:58:43

Tags: unix awk grep

I have two files, each containing a single column that looks like this:

File 1

chr1 106623434
chr1 106623436
chr1 106623442
chr1 106623468
chr1 10699400
chr1 10699405
chr1 10699408
chr1 10699415
chr1 10699426
chr1 10699448
chr1 110611528
chr1 110611550
chr1 110611552
chr1 110611554
chr1 110611560

File 2

chr1 1066234
chr1 106994
chr1 1106115

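For each line in File 2, I want to search File 1 and pull every line that contains that complete string into a new file. The results of each search should go in their own column (or row), separated by tabs. The desired output looks like this: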

chr1 106623434  chr1 10699400   chr1 110611528
chr1 106623436  chr1 10699405   chr1 110611550
chr1 106623442  chr1 10699408   chr1 110611552
chr1 106623468  chr1 10699415   chr1 110611554
                chr1 10699426   chr1 110611560
                chr1 10699448     

1 Answer:

Answer 0: (score: 4)

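$ cat tst.awk
# First file (file2): save each search prefix in input order.
NR==FNR { tgts[++numTgts] = $0; next }
# Second file (file1): record the line under every prefix it starts with.
{
    for (tgtNr=1; tgtNr<=numTgts; tgtNr++) {
        tgt = tgts[tgtNr]
        if ($0 ~ "^"tgt) {    # tgt is used as a regexp anchored at line start
            numHits[tgtNr]++
            maxHits = (numHits[tgtNr] > maxHits ? numHits[tgtNr] : maxHits)
            hits[tgtNr,numHits[tgtNr]] = $0
        }
    }
}
# Print one row per hit number, with one left-justified 16-character column per prefix.
END {
    for (hitNr=1; hitNr<=maxHits; hitNr++) {
        for (tgtNr=1; tgtNr<=numTgts; tgtNr++) {
            printf "%-16s%s", hits[tgtNr,hitNr], (tgtNr<numTgts?OFS:ORS)
        }
    }
}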

$ awk -f tst.awk file2 file1
chr1 106623434   chr1 10699400    chr1 110611528
chr1 106623436   chr1 10699405    chr1 110611550
chr1 106623442   chr1 10699408    chr1 110611552
chr1 106623468   chr1 10699415    chr1 110611554
                 chr1 10699426    chr1 110611560
                 chr1 10699448
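
The script above pads each column to 16 characters and treats every line of file2 as a regular expression. Since the question asked for tab-separated output, here is a rough sketch of one possible variant, not part of the original answer. It assumes the file2 lines should be matched as literal prefixes, so it uses index() instead of a regexp, and it sets OFS to a tab. The temporary directory and the shortened variable names are only for this illustration; the sample data is copied from the question.

```shell
# Sample inputs from the question, written to a throwaway directory (illustration only).
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
chr1 106623434
chr1 106623436
chr1 106623442
chr1 106623468
chr1 10699400
chr1 10699405
chr1 10699408
chr1 10699415
chr1 10699426
chr1 10699448
chr1 110611528
chr1 110611550
chr1 110611552
chr1 110611554
chr1 110611560
EOF
printf '%s\n' 'chr1 1066234' 'chr1 106994' 'chr1 1106115' > "$dir/file2"

result=$(awk -v OFS='\t' '
NR==FNR { tgts[++numTgts] = $0; next }     # first file: the prefixes to search for
{
    for (t=1; t<=numTgts; t++)
        if (index($0, tgts[t]) == 1) {     # literal "starts with" test, no regexp
            hits[t, ++numHits[t]] = $0
            if (numHits[t] > maxHits) maxHits = numHits[t]
        }
}
END {
    # One row per hit number, one tab-separated column per prefix;
    # missing hits print as empty fields.
    for (h=1; h<=maxHits; h++)
        for (t=1; t<=numTgts; t++)
            printf "%s%s", hits[t,h], (t<numTgts ? OFS : ORS)
}' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -r "$dir"
```

With tabs as separators, short columns show up as empty fields rather than runs of spaces, which is usually easier to load into a spreadsheet or to split with cut. If the prefixes are in fact meant to be regular expressions, the original `$0 ~ "^"tgt` test is the one to keep.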