Question

I would like to know how to accomplish the following. For example, I have a file with these contents:

monkey
donkey
chicken
horse

I want to grep it, so I run grep "horse\|donkey\|chicken", which gives me:

donkey
chicken
horse

However, what I really want is the following:

horse
donkey
chicken

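So I want the output in the order of my "regular expression". I checked the man page but could not find any option for this. Is it possible (with grep)?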

Answer 1

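grep prints matches in the order they appear in the input; the order of the subexpressions in the regular expression has no effect on it. If you really want the results in that order, you can grep the file three times: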

for f in myfile
do
  grep horse "$f"
  grep donkey "$f"
  grep chicken "$f"
done
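
The same idea can be generalized to any number of patterns. The following is only a minimal sketch: ordered_grep is a hypothetical helper name, not an existing tool, and it assumes the first argument is the file and the remaining arguments are the patterns in the desired output order.

```shell
#!/bin/sh
# ordered_grep FILE PATTERN...
# Hypothetical helper (name chosen for illustration only).
# Runs grep once per pattern, so the output is grouped by pattern
# in the order the patterns were given on the command line.
ordered_grep() {
    file=$1
    shift
    for pattern in "$@"; do
        # -e keeps a pattern that starts with a dash from being read as an option
        grep -e "$pattern" -- "$file"
    done
}
```

Called as ordered_grep myfile horse donkey chicken, it reads the file once per pattern, and a line that matches more than one pattern is printed once for each of them.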

Answer 2

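Try this solution using perl. It can fail in many ways and has serious limitations, such as allowing no more than 9 alternatives (|) in the expression. That is because the script wraps each word in parentheses and looks for matches in $1, $2, etc.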

Contents of script.pl:

#!/usr/bin/env perl

use warnings;
use strict;

my (%matches, %words);

die qq|Usage: perl $0 <input-file> <regular-expression-PCRE>\n| unless @ARGV == 2;

my $re = pop;

## Assign an ordered number for each subexpression.
do {
    my $i = 0;
    %words = map { ++$i => $_ } split /\|/, $re;
};

## Surround each subexpression between parentheses to be able to select them
## later with $1, $2, etc.
$re =~ s/^/(/;
$re =~ s/$/)/;
$re =~ s/\|/)|(/g;

$re = qr/$re/;

## Process each line of the input file.
while ( <> ) { 
    chomp;

    ## If it matches any of the alternatives, search for it in any of the
    ## grouped expressions (limited to 9).
    if ( m/$re/o ) { 
        for my $i ( 1 .. 9 ) { 
            if ( eval '$' . $i ) { 
                $matches{ $i }++;
            }   
        }   
    }   
}

## Print them sorted.
for my $key ( sort keys %matches ) { 
    printf qq|%s\n|, $words{ $key } for ( 1 .. $matches{ $key } );
}

Assuming infile contains this data:

monkey
donkey
chicken
horse
dog
cat
chicken
horse

Run it like this:

perl script.pl infile 'horse|donkey|chicken'

That yields:

horse
horse
donkey
chicken
chicken

Answer 3

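You can also use awk. The following example records matches in the op array, keyed by pattern, and prints them in the original pattern order in the END rule. Note that only the last matching line is kept for each pattern: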

pattern-ordered-grep.awk

BEGIN { split(patterns, p) }

{ 
  for(i=1; i<=length(p); i++)
    if($0 ~ p[i])
      op[p[i]] = $0
}

END {
  for(i=1; i<=length(p); i++)
    if(p[i] in op) 
      print op[p[i]]
}

Run it like this:

awk -v patterns='horse chicken donkey' -f pattern-ordered-grep.awk infile

Output:

horse
chicken
donkey

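Note that if you only want to output the patterns rather than the matching lines, replace the last line of code (the print statement in the END rule) with print p[i].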

Answer 4

Just create an array of the strings you want, and each time you find one, move on to checking for the next element of the array:

$ cat tst.awk
BEGIN{ numStrings = split("horse donkey chicken",strings) }
$0 == strings[numFound+1] { numFound++ }
numFound == numStrings { print "Found them all!"; exit }

$ cat file2           
monkey
horse
donkey
chicken

$ awk -f tst.awk file2
Found them all!

$ cat file            
monkey
donkey
chicken
horse

$ awk -f tst.awk file
$

Answer 5

How about this?

cat file1.txt | grep -e horse -e donkey -e chicken | sort -r
horse
donkey
chicken
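
One caveat: sort -r orders lines in reverse lexical order, so the pipeline above produces the requested order only because horse, donkey, chicken happens to be reverse alphabetical. A small demonstration follows, with sort_r_demo as a hypothetical name used only for illustration. Here the desired order is chicken first and then horse, yet horse is still printed first.

```shell
#!/bin/sh
# sort_r_demo is a hypothetical name used only for this illustration.
# Desired order: chicken, then horse.
sort_r_demo() {
    printf 'monkey\ndonkey\nchicken\nhorse\n' |
        grep -e chicken -e horse |
        sort -r    # reverse lexical order, unrelated to the order of the -e options
}

sort_r_demo
# prints:
# horse
# chicken
```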

Ordering grep output by the regular expression
