Question

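I'm trying to delete all fields from a matching pattern to the end of the line, except that I want to keep the one field that follows the pattern. There may be multiple patterns.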

Example:

one two three four five six seven
robin mike luke jennifer jessie mark
...

Patterns:

two
jennifer

Output:

one two three
robin mike luke jennifer jessie
...

I tried:

cat file | sed -E 's/(.+ two|jennifer) .+/\1 /'
one two
robin mike luke jennifer

But the next field is missing from the output.

Answer 1

Since it looks like you have access to GNU tools, I would suggest using grep:

grep -Eo '.*\b(two|jennifer)(\s+\S+)?' file

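This matches everything up to the field "two" or "jennifer", followed by the next field if there is one. Thanks to @123 for the helpful suggestion.

-o prints only the matching part of each line, and -E enables extended regular expressions.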
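As a quick check, the command can be run against the sample input. The sketch below is only an illustration: the scratch file and the third, pattern-free line are assumptions added here. It also shows one consequence of -o, namely that lines with no match are not printed at all. If such lines should be kept, a sed substitution is one possible alternative; that is a swapped-in technique, not part of this answer, and it cuts after the first occurrence of a pattern rather than the last.

```shell
# Hypothetical scratch file: the two sample lines plus one line without any pattern.
tmp=$(mktemp)
printf '%s\n' \
  'one two three four five six seven' \
  'robin mike luke jennifer jessie mark' \
  'no pattern on this line' > "$tmp"

# The grep command from this answer (GNU grep: \b, \s and \S are GNU extensions).
# Lines without a match are dropped because -o prints matches only.
out_grep=$(grep -Eo '.*\b(two|jennifer)(\s+\S+)?' "$tmp")
printf '%s\n' "$out_grep"
# one two three
# robin mike luke jennifer jessie

# Assumed alternative: sed keeps lines that contain no pattern unchanged.
out_sed=$(sed -E 's/((^|[[:space:]])(two|jennifer)[[:space:]]+[^[:space:]]+).*/\1/' "$tmp")
printf '%s\n' "$out_sed"
# one two three
# robin mike luke jennifer jessie
# no pattern on this line

rm -f "$tmp"
```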

Answer 2

In awk:

$ awk 'NR==FNR{a[$1];next}{for(i=1;i<=NF;i++) if($i in a) NF=((i+1)>NF?NF:(i+1))} 1' pats ex
one two three
robin mike luke jennifer jessie

where pats is the pattern file and ex is the file of example records. Explanation:

NR==FNR {                           # process pattern file
    a[$1]                           # store all patterns into a hash
    next                            # skip to next record
}
{
    for(i=1;i<=NF;i++)              # for each word in example file record
        if($i in a)                 # check if found in a
            NF=((i+1)>NF?NF:(i+1))  # if found, cut record after the next word
} 1                                 # print the record

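Currently the program checks whether each word is found in the hash a, so every record is checked against all patterns. For example, the first record is checked for both two and jennifer. If that is not what you want, and the pattern on line n should apply only to record n, it is easy to change by replacing

line 2: a[$1] with a[FNR]=$1, and
line 7: if($i in a) with if($i==a[FNR])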
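With those two replacements applied, the script might look like the sketch below. The scratch files stand in for pats and ex, and the second record is an assumption made up here: it deliberately starts with "two" to show that only the pattern on the same line number (jennifer) is applied to it. Assigning to NF to truncate a record works in gawk and mawk; other awk implementations may behave differently.

```shell
# Hypothetical scratch files standing in for pats and ex.
pats=$(mktemp); ex=$(mktemp)
printf '%s\n' 'two' 'jennifer' > "$pats"
printf '%s\n' \
  'one two three four five six seven' \
  'two robin mike luke jennifer jessie mark' > "$ex"

out=$(awk '
NR==FNR { a[FNR]=$1; next }          # store pattern n under its line number n
{
    for (i=1; i<=NF; i++)            # for each word in the record
        if ($i == a[FNR])            # compare only against the pattern for this line
            NF = ((i+1) > NF ? NF : (i+1))   # cut the record after the next word
} 1' "$pats" "$ex")                  # 1 prints the (possibly shortened) record
printf '%s\n' "$out"
# one two three
# two robin mike luke jennifer jessie

rm -f "$pats" "$ex"
```

In the second record the leading "two" is left alone, whereas the original version (if($i in a)) would have cut that record down to "two robin".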

How do I delete everything after the next word following a regex match?
