Question

I am trying to use grep to find the words in one file that do not appear in another file:

grep -v -w -i -r -f "dont_use_words.txt" "list_of_words.txt" >> inverse_match_words.txt


uniq -c -i inverse_match_words.txt | sort -nr

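But the uniq command still gives me duplicate values. Why is that?

I wonder whether it is because grep treats the "AAA" in strings such as "GIRLAAA", "AAABOY", and "GIRLAAABOY" as distinct matches, so I end up with duplicates.

When I run grep -F "AAA", all of these are returned.

I would appreciate it if someone could help me with this. I am new to Linux.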

Answer 1

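uniq removes all but one line from each group of consecutive duplicate lines. The conventional way to use it is therefore to pass the input through sort first. You didn't do that, so yes, it is entirely possible for (non-consecutive) duplicates to remain in the output.

Example: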

grep -v -w -i -f dont_use_words.txt list_of_words.txt \
  | sort -f \
  | uniq -c -i \
  | sort -nr
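
To see the effect in isolation, here is a small sketch that runs uniq with and without a preceding sort. The word list is made-up sample data, not taken from your files, and the variable names are only for illustration.

```shell
# Hypothetical sample input: duplicates exist, but none are on adjacent lines.
words=$(printf '%s\n' apple Banana apple banana cherry)

# uniq alone compares only neighbouring lines, so nothing is merged here.
unsorted=$(printf '%s\n' "$words" | uniq -c -i)

# Sorting case-insensitively first puts equal words next to each other,
# so uniq -c -i can merge and count them.
sorted=$(printf '%s\n' "$words" | sort -f | uniq -c -i | sort -nr)

printf 'uniq alone:\n%s\n\n' "$unsorted"
printf 'sort | uniq:\n%s\n' "$sorted"
```

With this input, uniq alone should leave all five lines in place, each with a count of 1, while the sorted pipeline should collapse them into three groups.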