Question

In my shell script, I am trying to search the same $targetfile over and over, using terms found in $sourcefile.

My $sourcefile is formatted as follows:

pattern1
pattern2
etc...

The inefficient loop I use for the search is:

for line in $(< $sourcefile);do
    fgrep $line $targetfile | fgrep "RID" >> $outputfile
done

I understand this could be improved, either by loading the whole $targetfile into memory or perhaps by using AWK?

Thanks

Answer 1

Am I missing something, or why not just fgrep -f "$sourcefile" "$targetfile"?
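
To expand on this: the loop in the question also filters for "RID", so a second grep is still needed after the single-pass match. A minimal sketch, assuming a POSIX grep (grep -F is the standard spelling of fgrep); the sample file contents are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample data, only for illustration.
sourcefile=$(mktemp)
targetfile=$(mktemp)
trap 'rm -f "$sourcefile" "$targetfile"' EXIT

printf '%s\n' 'pattern1' 'pattern2' > "$sourcefile"
printf '%s\n' \
    'RID pattern1 a' \
    'pattern2 without the marker' \
    'RID unrelated' \
    'RID x pattern2' > "$targetfile"

# -F: treat each pattern as a fixed string; -f: read the patterns from a file.
# The target file is read once, then the matches are narrowed to lines with RID.
result=$(grep -F -f "$sourcefile" "$targetfile" | grep -F 'RID')

printf '%s\n' "$result"
```

Unlike the loop, this prints matches in the order they occur in $targetfile, and a line that contains several patterns is printed only once.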

Answer 2

A sed solution:

sed 's/\(.*\)/\/\1\/p/' "$sourcefile" | sed -nf - "$targetfile"

This converts each line of $sourcefile into a sed pattern-match command, turning:

matchstring

into

/matchstring/p

However, you would need to escape special characters to make it robust.
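
One way to do that escaping, as a rough sketch rather than a hardened solution: backslash-escape the characters that are special in a basic regular expression (plus the / delimiter) before wrapping each line in /.../p. The sample data and the intermediate $scriptfile are assumptions of this example. It writes the generated script to a temporary file instead of piping it to "-f -", since reading the script from standard input is not something every sed is guaranteed to support.

```shell
#!/bin/sh
# Hypothetical sample data, only for illustration.
sourcefile=$(mktemp)
targetfile=$(mktemp)
scriptfile=$(mktemp)
trap 'rm -f "$sourcefile" "$targetfile" "$scriptfile"' EXIT

printf '%s\n' 'a.c' 'x/y' > "$sourcefile"
printf '%s\n' 'abc' 'a.c here' 'path x/y' 'xzy' > "$targetfile"

# First expression: put a backslash before ] [ \ / . ^ $ *
# Second expression: wrap the escaped line as /.../p
sed -e 's|[][\\/.^$*]|\\&|g' -e 's|.*|/&/p|' "$sourcefile" > "$scriptfile"

# -n suppresses default output, so only lines hit by a /.../p command print.
result=$(sed -n -f "$scriptfile" "$targetfile")

printf '%s\n' "$result"
```

With the escaping in place, "a.c" no longer matches "abc". Note that a target line matching several patterns is printed once per matching pattern, and that the "RID" filter from the question would still have to be applied afterwards.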

Answer 3

Use awk to read the source file first, then search the target file (untested):

nawk '
    NR == FNR {patterns[$0]++; next}
    /RID/ {
        for (pattern in patterns) {
            # since fgrep considers patterns as strings not regular expressions, 
            # use string lookup and not pattern matching ("~" operator).
            if (index($0, pattern) > 0) {
                print
                break
            }
        }
    }
' "$sourcefile" "$targetfile" > "$outputfile"

This also works with gawk.
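
Since the script above is marked untested, here is a small run of the same idea against made-up data. This is only a sketch: it assumes a POSIX awk is available as "awk", and the file contents are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data, only for illustration.
sourcefile=$(mktemp)
targetfile=$(mktemp)
trap 'rm -f "$sourcefile" "$targetfile"' EXIT

printf '%s\n' 'a.c' 'pattern2' > "$sourcefile"
printf '%s\n' \
    'RID abc' \
    'RID a.c' \
    'pattern2 only' \
    'RID pattern2 and a.c' > "$targetfile"

result=$(awk '
    # First file: remember every line as a fixed-string pattern.
    NR == FNR { patterns[$0]; next }
    # Second file: only consider lines containing RID.
    /RID/ {
        for (p in patterns) {
            # index() is a plain substring search, so "." is not a wildcard.
            if (index($0, p) > 0) { print; break }
        }
    }
' "$sourcefile" "$targetfile")

printf '%s\n' "$result"
```

"RID abc" is skipped because index() compares strings literally, and the last line is printed once even though it contains both patterns, thanks to the break. One caveat: the NR == FNR test assumes $sourcefile is not empty.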

Optimizing grep (or using AWK) in a shell script
