Question

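I want to search a large set of files for a group of words in any order, with or without spaces or punctuation between them. So, for example, if I search for hello, there, friend, it should match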

hello there my friend
friend, hello there
theretherefriendhello

but not

hello friend
there there friend

I have not been able to find any way to do this. Is it even possible with grep or some variant of grep?

Answer 1

You can use sed:

sed -n '/word1/{/word2/{/word3/p;};}' *.txt
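As a minimal sketch of how the nested sed addresses behave, the following runs the same command with the question's words against the question's sample lines. The temporary file and the variable names (tmp, sed_out) are assumptions made for illustration only.

```shell
# Hypothetical sample file holding the five lines from the question
tmp=$(mktemp)
printf '%s\n' \
  'hello there my friend' \
  'friend, hello there' \
  'theretherefriendhello' \
  'hello friend' \
  'there there friend' > "$tmp"

# Each nested /pattern/{...} block runs only if the line matched the outer
# pattern, so p is reached only when all three words are on the same line
sed_out=$(sed -n '/hello/{/there/{/friend/p;};}' "$tmp")
printf '%s\n' "$sed_out"

rm -f "$tmp"
```

Only the first three sample lines should be printed, since the last two each lack one of the words.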

Answer 2

Is it even possible with grep or some variant of grep?

You can use grep -P, i.e. Perl-compatible regex mode, with the following regular expression.

^(?=.*hello)(?=.*there)(?=.*friend).*$

See demo.
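A short sketch of the lookahead pattern in use, assuming a grep built with PCRE support (the -P option is a GNU grep feature and may be missing elsewhere). The temporary file and variable names are hypothetical.

```shell
# Hypothetical sample file holding the five lines from the question
tmp=$(mktemp)
printf '%s\n' \
  'hello there my friend' \
  'friend, hello there' \
  'theretherefriendhello' \
  'hello friend' \
  'there there friend' > "$tmp"

# Each (?=.*word) lookahead is tested from the start of the line and consumes
# nothing, so the three words may appear in any order
grep_out=$(grep -P '^(?=.*hello)(?=.*there)(?=.*friend).*$' "$tmp")
printf '%s\n' "$grep_out"

rm -f "$tmp"
```

Because the lookaheads do not consume input, the order in which they are written does not affect which lines match.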

Answer 3

For this, I would use awk like so:

awk '/hello/ && /there/ && /friend/' file

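This checks whether the current line contains all of the strings hello, there and friend. If it does, the line is printed.

Why? Because the condition evaluates to true, and awk's default action when a pattern is true is to print the current line.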
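To illustrate the default action, this sketch runs the pattern once without an action and once with an explicit print, and the two should produce the same lines. The temporary file and the variable names (implicit, explicit) are assumptions for the example.

```shell
# Hypothetical sample file holding the five lines from the question
tmp=$(mktemp)
printf '%s\n' \
  'hello there my friend' \
  'friend, hello there' \
  'theretherefriendhello' \
  'hello friend' \
  'there there friend' > "$tmp"

# Pattern only: awk falls back to its default action, printing the line
implicit=$(awk '/hello/ && /there/ && /friend/' "$tmp")

# Same pattern with the action spelled out
explicit=$(awk '/hello/ && /there/ && /friend/ { print $0 }' "$tmp")

printf '%s\n' "$implicit"

rm -f "$tmp"
```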

Answer 4

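With basic and extended REs, without vendor- or version-specific extensions such as Perl REs, you would need to handle this with something like the following: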

egrep  -lr 'hello.*there.*friend|hello.*friend.*there|there.*hello.*friend|there.*friend.*hello|friend.*hello.*there|friend.*there.*hello' /path/

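Note that the -l option reports only file names, and -r tells grep to search recursively. This solution should work with almost every variant of grep you are likely to encounter.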

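This is obviously inelegant as far as the RE goes, but it is convenient in that it uses grep's built-in recursive search. If the RE bothers you, I would use awk or sed instead, wrapped in find if you can: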

find /path/ -exec awk '/hello/&&/there/&&/friend/ {r=1} END {exit 1-r}' {} \; -print

Again, this produces a list of files, not a list of lines. You can adjust it to suit your specific requirements.
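As a rough check that the two approaches list the same files, here is a sketch run against a throwaway directory. The directory layout, the file names, the variable names and the added -type f test are assumptions for illustration. grep -E is used in place of egrep, which newer GNU grep releases flag as obsolescent.

```shell
# Hypothetical test tree: one file with a matching line, one without
dir=$(mktemp -d)
printf '%s\n' 'friend, hello there' > "$dir/match.txt"
printf '%s\n' 'hello friend' 'there there friend' > "$dir/nomatch.txt"

# Approach 1: extended RE that spells out all six orderings
re='hello.*there.*friend|hello.*friend.*there|there.*hello.*friend|there.*friend.*hello|friend.*hello.*there|friend.*there.*hello'
grep_files=$(grep -E -lr "$re" "$dir")

# Approach 2: awk exits 0 only if some line held all three words,
# so find reaches -print only for those files
awk_files=$(find "$dir" -type f -exec awk '/hello/&&/there/&&/friend/ {r=1} END {exit 1-r}' {} \; -print)

printf 'grep:     %s\nfind+awk: %s\n' "$grep_files" "$awk_files"

rm -r "$dir"
```

nomatch.txt contains all three words, but spread over two lines, so neither approach lists it: both match line by line.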