Question

I have a file containing these patterns:

word word2
word
word word
word wordword

I need to count every word that is exactly 'word', not 'word2' or 'wordword'.

I tried:

$ grep 'word[^a-ZA-Z0-9 | $]' testWordCount.txt       
$ grep 'word[^a-ZA-Z0-9]' testWordCount.txt    
$ grep 'word[$| ]' testWordCount.txt

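Sorry if some of these don't make sense; I'm still learning regular expressions. Apologies also for not mentioning which tool I'm using the regex with.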

Answer 1

Use the following regular expression to match the lines:

/\bword\b/

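\b is a word boundary anchor: it matches at the position between a word character and a non-word character, that is, at the start or end of a word, including when the word sits at the start or end of a line.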

You can test this expression at RegexPal.

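I see you're using grep. Its regex engine (GNU grep, at least) also accepts the escapes \< and \> as word boundaries, matching the start and the end of a word respectively: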

/\<word\>/
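As a rough illustration of what this pattern does on the sample data from the question, here is a minimal sketch assuming GNU grep. The temporary file and the variable names `sample` and `line_count` are made up for this example. Note that `grep -c` counts matching lines, not individual matches, so a line such as "word word" is counted only once.

```shell
# Recreate the sample data from the question in a throwaway file
# (hypothetical path chosen by mktemp, not the asker's real file).
sample=$(mktemp)
printf '%s\n' 'word word2' 'word' 'word word' 'word wordword' > "$sample"

# -c counts LINES containing at least one standalone "word".
# All four lines qualify, although there are five standalone occurrences.
line_count=$(grep -c '\<word\>' "$sample")
echo "matching lines: $line_count"   # prints "matching lines: 4"
```

Because the line count (4) differs from the number of standalone occurrences (5), each word has to be put on its own line before counting.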

Also, here is how you can count all instances in bash:

tr ' ' '\n' < testWordCount.txt | grep -c '\<word\>'

Answer 2

egrep -o prints each match on its own line, so the matches are easy to count afterwards. \b denotes a word boundary.

egrep -o "\bword\b" words.txt | wc -l
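A possible variation on the same idea, sketched under the assumption that your grep supports the -o and -w options (GNU and BSD grep do; they are not required by POSIX): -w restricts matches to whole words, so no explicit boundary escapes are needed. The temporary file and the variable names `sample` and `occurrences` are invented for this example.

```shell
# Recreate the sample data from the question in a throwaway file
# (hypothetical path chosen by mktemp).
sample=$(mktemp)
printf '%s\n' 'word word2' 'word' 'word word' 'word wordword' > "$sample"

# -o prints every match on its own line; -w accepts whole-word matches only.
# wc -l then counts those lines; tr strips the padding some wc versions add.
occurrences=$(grep -ow 'word' "$sample" | wc -l | tr -d ' ')
echo "standalone occurrences: $occurrences"   # prints "standalone occurrences: 5"

rm -f "$sample"
```

Here 'word2' and 'wordword' contribute nothing, because in both cases the text "word" is adjacent to another word character.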