Checking two variables in a text file

Time: 2014-04-24 14:55:35

Tags: shell text-processing

I have a file that contains several numbers in a single column (numbers.txt):

2
5
126
3005
65

And I have another text file that looks like this (input.txt):

#    126    2    0
bla    mjnb     kjh    ojj
#    5    65    0
kjh    jhgg    kjhkjh    juh
hgj    ikaw    esd     cdqw
#    100    3005    0
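jhgjh    jh    jhjhg    pol

The first line of each block (the one starting with #) is what matters: both numbers written after the # must appear in numbers.txt. I have written the following code, but on my huge file it would take several weeks. My numbers.txt contains about 2500 numbers.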

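    #!/bin/bash
    # Try every ordered pair of numbers; each pair rescans input.txt once.
    cat numbers.txt | while read first
    do
        for second in $(cat numbers.txt)
        do
            # Records are split on "#"; print those that start with the pair.
            awk -v RS="#" "/ $first    $second / {sub(/^ /,RS);print}" input.txt >> output1.txt
        done
    done

The output should be: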

#    126    2    0
bla    mjnb     kjh    ojj
#    5    65    0
kjh    jhgg    kjhkjh    juh
hgj    ikaw    esd     cdqw

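Can anyone suggest a faster way to get this output?

2 answers:

Answer 0: (score: 1)

With awk you can do this without nested loops (here file1 is numbers.txt and file2 is input.txt):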

awk 'FNR==NR{a[$0];next} $1=="#" && ($2 in a) && ($3 in a) {p=1}
           $1=="#" && (!($2 in a) || !($3 in a)) {p=0} p' file1 file2
#    126    2    0
bla    mjnb     kjh    ojj
#    5    65    0
kjh    jhgg    kjhkjh    juh
hgj    ikaw    esd     cdqw
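
As a self-contained check, the sketch below recreates the sample files from the question in a scratch directory and runs the same command with the question's file names. The `workdir` and `result` variables and the use of `mktemp -d` are assumptions added for illustration. For scale: the nested loops in the question start awk once per ordered pair of numbers, about 2500 × 2500 = 6.25 million passes over input.txt, whereas this approach reads each file once.

```shell
#!/bin/sh
# Hypothetical scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
printf '%s\n' 2 5 126 3005 65 > numbers.txt
cat > input.txt <<'EOF'
#    126    2    0
bla    mjnb     kjh    ojj
#    5    65    0
kjh    jhgg    kjhkjh    juh
hgj    ikaw    esd     cdqw
#    100    3005    0
jhgjh    jh    jhjhg    pol
EOF

# First file (FNR==NR): remember every number as an array key.
# Second file: each "#" line switches printing on or off;
# the trailing "p" prints the current line while the flag is on.
result=$(awk 'FNR==NR{a[$0];next} $1=="#" && ($2 in a) && ($3 in a) {p=1}
           $1=="#" && (!($2 in a) || !($3 in a)) {p=0} p' numbers.txt input.txt)

printf '%s\n' "$result"
```

Because `a[$0]` uses the whole line as the key, this assumes numbers.txt has exactly one number per line with no surrounding whitespace.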

Answer 1: (score: 1)

awk '
    # read the numbers file into the array "num"
    NR == FNR {num[$1]; next} 

    # if this is a "#" line and the first 2 numbers are in "num" set a flag to "true"
    $1 == "#" {p = (($2 in num) && ($3 in num))} 

    # print the current line if the flag is true
    p
' numbers.txt input.txt
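
To see how the flag changes from one "#" line to the next, a small trace can help. The sketch below is only an illustration: it rebuilds the question's sample files in a scratch directory (the `workdir` and `trace` names are assumptions) and prints the two numbers of each "#" line together with the resulting flag value instead of printing the block itself.

```shell
#!/bin/sh
# Hypothetical scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
printf '%s\n' 2 5 126 3005 65 > numbers.txt
cat > input.txt <<'EOF'
#    126    2    0
bla    mjnb     kjh    ojj
#    5    65    0
kjh    jhgg    kjhkjh    juh
hgj    ikaw    esd     cdqw
#    100    3005    0
jhgjh    jh    jhjhg    pol
EOF

trace=$(awk '
    # first file: store each number as a key of "num"
    NR == FNR {num[$1]; next}

    # "#" line: compute the flag (1 or 0) and report it
    $1 == "#" {
        p = (($2 in num) && ($3 in num))
        print $2, $3, "->", p
    }
' numbers.txt input.txt)

printf '%s\n' "$trace"
```

A flag of 1 means the "#" line and everything up to the next "#" line would be printed; a flag of 0 means that block would be skipped.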