Question

I want to find all lines that contain the word "the" at least 3 times.

I know how to find such lines with a regular expression:

grep -E "(the)(\s(.+)\s\1){2,}" file.txt

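It works: grep finds such lines. But my question is: is it possible to highlight only the words "the", rather than the whole text between the first and the last "the"?

In other words, I don't want to find every "the" in the text, only those on lines that contain at least 3 of them, and I want to highlight just those words to make the output more readable.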

I tried something from https://www.regular-expressions.info/refadv.html, such as (?=), but it doesn't work:

grep -E "(the)((?=\s(.+)\s)\1){2,}" file.txt

Text:

the cat
in the garden there was the cat
in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd

Current output (everything from the first to the last "the" is highlighted):

in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd

Desired output (only the words "the" are highlighted):

in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd

Answer 1

You can pipe one grep into another:

grep -E '(\bthe\b.*?){3}' file | grep --color '\bthe\b'

Output:

in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd

In grep -E '(\bthe\b.*?){3}' file | grep --color '\bthe\b', the first grep finds all lines that contain the whole word the at least 3 times, and the second grep only adds color to each the.
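To make the two stages easier to see, here is a minimal self-contained sketch of the same pipeline. It assumes GNU grep (for \b and --color); the temporary file from mktemp and the variable names sample, matched and colored are only illustrative. The lazy .*? is written as a plain .* here, which selects the same lines.

```shell
# Write the sample text from the question to a throwaway file (illustrative path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
the cat
in the garden there was the cat
in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd
EOF

# Stage 1: keep only lines with at least three whole-word occurrences of "the".
matched=$(grep -E '(\bthe\b.*){3}' "$sample")

# Stage 2: color every whole-word "the" on the surviving lines.
# --color=always forces the escape codes even though the output is captured.
colored=$(printf '%s\n' "$matched" | grep --color=always '\bthe\b')

printf '%s\n' "$colored"
rm -f "$sample"
```

With --color=auto (the usual default when --color is given without a value), the highlighting is dropped as soon as the result is piped or redirected, so --color=always is the safer choice if the output goes through a pager such as less -R.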

Answer 2

Here is an awk that counts the words on each line and prints in bold the words that occur three or more times:

$ awk '
BEGIN {
    b="\033[1m"
    n="\033[0m"
}
{
    delete a
    for(i=1;i<=NF;i++)
        # if(length($i)==3)  # uncomment this to consider three-letter words only
        a[$i]++
    for(i=1;i<=NF;i++)
        printf "%s%s%s%s",(a[$i]>=3?b:""),$i,(a[$i]>=3?n:""),(i==NF?ORS:OFS)
}' file

the cat
in the garden there was the cat
in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd

If you want to consider three-letter words only, add if(length($i)==3) before a[$i]++.

Edit

I missed the part about printing only the lines that contain bolded words. Fixed now:

$ awk '
BEGIN {
    b="\033[1m"
    n="\033[0m"
}
{
    for(i=1;i<=NF;i++)
        if(length($i)==3)
            a[$i]++
    for(i=1;i<=NF;i++)
        buf=buf (i>1?OFS:"") (a[$i]>=3&&(f=1)?b:"") $i (a[$i]>=3?n:"")
    if(f)
        print buf
    delete a; buf=""; f=""
}' file

in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd
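The awk above bolds any word that repeats three times. If only one specific word matters, as in the question, a variation along these lines might be simpler. This is a sketch, not part of the original answer: the variables word and min, and the temporary sample file, are assumptions made for illustration.

```shell
# Write the sample text from the question to a throwaway file (illustrative path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
the cat
in the garden there was the cat
in the box there is the cat and the dog and the bird
aaa the bbb the ccc the ddd
EOF

# word and min are hypothetical parameters: the word to highlight and the
# minimum number of occurrences a line needs before it is printed.
result=$(awk -v word="the" -v min=3 '
BEGIN { b = "\033[1m"; n = "\033[0m" }   # bold on / reset
{
    count = 0
    for (i = 1; i <= NF; i++)
        if ($i == word) count++
    if (count < min) next                 # skip lines with too few occurrences
    line = ""
    for (i = 1; i <= NF; i++)
        line = line (i > 1 ? OFS : "") ($i == word ? (b $i n) : $i)
    print line
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because awk splits fields on whitespace, a token such as "the," with attached punctuation is not counted as the word; that trade-off is the same as in the answer's version.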

How to find 3 or more words in the same line and highlight only them instead of the whole text using grep?
