Question

I am trying to get counts of various strings in a large txt file using bash commands.

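For example, use bash to find the counts of the strings 'pig', 'horse', and 'cat', producing output like 'pig: 7, horse: 3, cat: 5'. I would like a way that searches the txt file only once, because it is very large (so I don't want to search the whole file for 'pig', then go back and search for 'horse', and so on).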

Any help with a command would be much appreciated. Thanks!

Answer 1

grep -Eo 'pig|horse|cat' txt.file | sort | uniq -c | awk '{print $2": "$1}'

Breaking it down into pieces:

grep -Eo 'pig|horse|cat'  Print only the matching parts (-o), one per
                          line, using an extended (-E) regex
sort                      Sort the resulting words
uniq -c                   Output unique values (of sorted input)
                          with the count (-c) of each value
awk '{print $2": "$1}'    For each line, print the second field (the word)
                          then a colon and a space, and then the first
                          field (the count).
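
The pipeline above prints one word per line, in sorted order. If the single-line format from the question ('pig: 7, horse: 3, cat: 5') is wanted, one possible variation is to let awk do the tallying itself, which also removes the sort step. This is only a sketch: the function name count_words and the file sample.txt are made up for illustration, and the word list is hard-coded in two places.

```shell
# Hypothetical helper: count_words FILE
# Reads FILE once; grep emits each match on its own line and awk tallies
# the matches in an associative array, then prints a single summary line.
count_words() {
  grep -Eo 'pig|horse|cat' "$1" |
    awk '{ count[$0]++ }
         END {
           # Fixed output order; words never seen are printed with a count of 0.
           n = split("pig horse cat", words, " ")
           for (i = 1; i <= n; i++)
             printf "%s%s: %d", (i > 1 ? ", " : ""), words[i], count[words[i]]
           print ""
         }'
}

# Usage with made-up sample data:
printf 'pig horse cat\npig cat pig\nhorse dog pig\n' > sample.txt
count_words sample.txt    # prints: pig: 4, horse: 2, cat: 2
```

Note that -o counts substring matches, so the 'cat' inside 'category' is counted too. If only whole words should count, adding -w to the grep options (supported by GNU and BSD grep) restricts the matches accordingly.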
