Question

I'm trying to figure out how to speed up a recursive grep by searching only the files in subdirectories whose name matches a string.

Example:

/foo/bar/baz/mylogs/somelog.log
/foo/bar/notme.log
/cat/dog/mylogs/anotherlog.log

I only need to grep the *.log files under */mylogs/.

I can find all the log files with the following...

egrep -h -R --include \*.log '(patterns|to|match)'

But this doesn't work...

egrep -h -R --include \/mylogs/\*.log '(patterns|to|match)'

How can I narrow down the include path?

Answer 1

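One option is to use find to locate every directory with that name and use its -exec action to run egrep on what it finds. The following example recursively egreps the matching directories for the pattern "foo":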

find . -type d -name mylogs -exec egrep -hR --include=\*.log foo {} +

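The trailing + means that find passes as many results as possible to a single egrep process, instead of starting one process per directory.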
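To try the command safely, here is a minimal sketch that rebuilds the question's layout in a scratch directory and runs the same search. The file contents and the variable names (demo_dir, matches) are made up for illustration, and it assumes GNU grep, since -R and --include are GNU extensions. It spells egrep as grep -E, which is equivalent.

```shell
#!/usr/bin/env bash
# Sketch only: scratch copy of the question's layout with hypothetical contents.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/foo/bar/baz/mylogs" "$demo_dir/cat/dog/mylogs"
echo "foo in somelog"    > "$demo_dir/foo/bar/baz/mylogs/somelog.log"
echo "foo in notme"      > "$demo_dir/foo/bar/notme.log"
echo "foo in anotherlog" > "$demo_dir/cat/dog/mylogs/anotherlog.log"

# Only directories named mylogs are handed to grep, so notme.log is never read.
# find's traversal order is unspecified, hence the sort for stable output.
matches=$(cd "$demo_dir" &&
    find . -type d -name mylogs -exec grep -E -hR --include='*.log' foo {} + | sort)
printf '%s\n' "$matches"

rm -rf "$demo_dir"
```

The line from notme.log should be absent from the output, because its directory is never passed to grep.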

Answer 2

grep has an --exclude-dir option but no --include-dir option. The --include option only lets you match file names, not file paths.

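If you are using bash 4.0 or later, or another shell that supports recursive ** globbing, you can also do the following (in bash, ** only recurses when the globstar option is set):

$ shopt -s globstar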
$ # mylogs without sub-dirs
$ ls **/mylogs/*.log
cat/dog/mylogs/anotherlog.log  foo/bar/baz/mylogs/somelog.log

$ # if mylogs can have sub-dirs as well
$ ls **/mylogs/**/*.log
a/b/mylogs/c/d/f.log  cat/dog/mylogs/anotherlog.log  foo/bar/baz/mylogs/somelog.log

Once you are satisfied that the glob matches the files you want, use it with grep (-E is needed for the alternation in the pattern):

grep -hE '(patterns|to|match)' **/mylogs/*.log
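One thing to watch with the glob approach: if nothing matches, bash passes the pattern through literally, or with nullglob passes nothing at all and grep waits on stdin. The sketch below guards against that by expanding the glob into an array first. It assumes bash 4.0 or later (recursive ** depends on the globstar option); the scratch layout, file contents, and the names demo_dir, log_files and hits are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: globstar enables recursive **, nullglob makes an
# unmatched glob expand to nothing instead of staying literal.
shopt -s globstar nullglob

demo_dir=$(mktemp -d)   # scratch layout with hypothetical contents
mkdir -p "$demo_dir/foo/bar/baz/mylogs" "$demo_dir/cat/dog/mylogs"
echo "to be found"      > "$demo_dir/foo/bar/baz/mylogs/somelog.log"
echo "patterns skipped" > "$demo_dir/foo/bar/notme.log"
echo "a match here"     > "$demo_dir/cat/dog/mylogs/anotherlog.log"

cd "$demo_dir" || exit 1
log_files=( **/mylogs/*.log )   # glob results come back sorted
hits=""
if (( ${#log_files[@]} > 0 )); then
    # Guard: with an empty list grep would read standard input instead.
    hits=$(grep -hE '(patterns|to|match)' "${log_files[@]}")
fi
printf '%s\n' "$hits"

cd / && rm -rf "$demo_dir"
```

Because the shell expands the glob before grep starts, a very large tree can exceed the argument-length limit; the find-based answers avoid that.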

Answer 3

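Never use any of the GNU grep options for finding files. They are completely unnecessary and only clutter the call to grep with a pile of file-searching options alongside the actual g/re/p options. Keep it simple: use find to find the files and grep to g/re/p within them: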

find . -type d -name mylogs -print0 |
xargs -0 -I XX find XX -maxdepth 1 -type f -name '*.log' -exec grep -h 'foo' {} +

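Or, if your file names contain no newlines (and, since plain xargs also splits on blanks and treats quotes specially, none of those either), you could even just do: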

find . -type f -name '*.log' | grep '/mylogs/' | xargs grep -h 'foo'
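If file names may contain spaces, the same filtering can be done inside find itself, so no names ever pass through a pipe. This is a sketch of a variant, not the answer's own command: it uses find's -path test, and the scratch layout, contents, and the names demo_dir and hits are hypothetical. Like the grep '/mylogs/' filter, it also matches files in subdirectories of mylogs, because * in -path can match slashes.

```shell
#!/bin/sh
# Sketch only: scratch layout with hypothetical contents, including a
# file name with a space that would break an unquoted xargs pipeline.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/foo/bar/baz/mylogs" "$demo_dir/cat/dog/mylogs"
echo "foo one"     > "$demo_dir/foo/bar/baz/mylogs/somelog.log"
echo "foo skipped" > "$demo_dir/foo/bar/notme.log"
echo "foo two"     > "$demo_dir/cat/dog/mylogs/another log.log"

# -path tests the whole path, -name only the last component.
# With "{} +" grep is not started at all when no file matches.
hits=$(cd "$demo_dir" &&
    find . -type f -path '*/mylogs/*' -name '*.log' -exec grep -h 'foo' {} + | sort)
printf '%s\n' "$hits"

rm -rf "$demo_dir"
```

The sort is only there to make the output order stable, since find does not guarantee a traversal order.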

Recursive grep only in directories that match a pattern?
