Question

OK, so I need to create a command that lists the 100 most frequently used words in any given file, in one block of text. What I have at the moment:

$ alias words='tr " " "\012" <hamlet.txt | sort -n | uniq -c | sort -r | head -n 10'

Output:

$ words
     14 the
     14 of
      8 to
      7 and
      5 To
      5 The
      5 And
      5 a
      4 we
      4 that

I need it to output in the following format:

the of to and To The And a we that

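((On that note, how would I tell it to print the output in all caps?))

I also need to change it so that I can use 'words' with any file: instead of naming the file inside the pipeline, the initial input would name the file and the pipeline would do the rest.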

Answer 1

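OK, taking your points one by one, though not necessarily in order.

You can change words to use standard input just by removing the <hamlet.txt bit, since tr reads from standard input by default. Then, if you want to process a specific file, use: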

cat hamlet.txt | words

or:

words <hamlet.txt
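
As a minimal sketch of that change, here is the pipeline from the question with the redirection removed. I have written it as a shell function rather than an alias, since aliases are normally not expanded inside scripts; the name count_words is my own and purely illustrative.

```shell
# Hypothetical function name; the pipeline is the one from the question,
# minus the "<hamlet.txt" redirection, so it reads standard input.
count_words() {
    tr " " "\012" | sort -n | uniq -c | sort -r | head -n 10
}

# Either form feeds the text in on standard input:
#   cat hamlet.txt | count_words
#   count_words <hamlet.txt
printf 'b a b\n' | count_words
```

The exact padding in front of each count depends on your uniq implementation, so treat the layout of the output as approximate.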

You can remove the effect of capital letters by making this the first part of the pipeline:

tr '[A-Z]' '[a-z]'

That lower-cases your input before anything else is done to it.
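
The same tr trick runs in the other direction, which should cover the all-caps part of the question. The sketch below uses the POSIX character classes [:upper:] and [:lower:] instead of '[A-Z]' and '[a-z]'; the helper names to_lower and to_upper are hypothetical, chosen only for this example.

```shell
# Hypothetical helper names, not part of the original pipeline.
to_lower() { tr '[:upper:]' '[:lower:]'; }
to_upper() { tr '[:lower:]' '[:upper:]'; }

printf 'To be, or not To Be\n' | to_lower   # to be, or not to be
printf 'the of to and\n' | to_upper         # THE OF TO AND
```

Put to_lower at the front of the pipeline so that "The" and "the" are counted together, and to_upper at the very end if you only want the final list in capitals.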

Lastly, take that entire pipeline (with the modifications suggested above) and pass its output through one more command:

| awk '{printf "%s ", $2}END{print ""}'

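This prints the second field of each line (the word) followed by a space, and then, once all lines have been read, prints an empty string with a terminating newline.

For example, the following script words.sh will give you what you need: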
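
To see that awk step in isolation, here is a small sketch that feeds it two hand-written lines in the shape uniq -c produces. The function name join_words is hypothetical.

```shell
# Hypothetical wrapper around the awk command, for illustration only.
join_words() {
    # $1 is the count that uniq -c puts in front; $2 is the word itself.
    awk '{printf "%s ", $2} END {print ""}'
}

printf '      3 three\n      2 two\n' | join_words
```

Note that every word, including the last one, is followed by a space, so the output line ends in a trailing space before the newline.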

tr '[A-Z]' '[a-z]' | tr ' ' '\012' | sort -n | uniq -c | sort -r |
    head -n 3 | awk '{printf "%s ", $2}END{print ""}'

(It can all go on one line; I have split it here for readability.) A sample run:

pax> echo One Two two Three three three Four four four four | ./words.sh
four three two

You can achieve the same thing with the following alias:

alias words="tr '[A-Z]' '[a-z]' | tr ' ' '\012' | sort -n | uniq -c | sort -r |
    head -n 3 | awk '{printf \"%s \", \$2}END{print \"\"}'"

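(Again, this can go on one line.) However, once things get this complex I prefer a script, if only to avoid an endless string of escape characters :-)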
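
If you would rather name the file as an argument, a shell function avoids the quoting problem entirely. The following is only a sketch under my own assumptions: the name top_words, the optional FILE and COUNT arguments, and the default of 100 are all hypothetical. It also departs from the pipeline above in two small ways: it uses tr -s so that runs of spaces do not produce empty "words", and sort -rn so that the counts are compared as numbers rather than as text.

```shell
# top_words [FILE] [COUNT]
# Hypothetical name and argument order. FILE defaults to "-" (standard
# input) and COUNT defaults to 100.
top_words() {
    cat "${1:--}" |
        tr '[:upper:]' '[:lower:]' |   # fold case so "The" and "the" match
        tr -s ' ' '\n' |               # one word per line, squeezing repeats
        sort |                         # group identical words together
        uniq -c |                      # prefix each word with its count
        sort -rn |                     # highest count first, numerically
        head -n "${2:-100}" |          # keep the top COUNT words
        awk '{printf "%s ", $2} END {print ""}'
}

# Usage:
#   top_words hamlet.txt        top 100 words in hamlet.txt
#   top_words hamlet.txt 10     top 10 words
#   cat hamlet.txt | top_words  read from standard input instead
echo One Two two Three three three Four four four four | top_words - 3
```

This still splits on spaces only, so punctuation stays attached to words, exactly as in the original pipeline.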