Question

I want to print each word and its number of occurrences in pure bash, without dict or other external tools.

I can count the total number of words, but I have a problem there too: the output does not show the correct total, it is lower than it should be.

What should I do?

Answer 1

You can use an associative array to count the words, something like this:

$ cat foo.sh
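#!/bin/bash

# Associative array (bash 4+): word -> number of occurrences
declare -A words

# Read stdin line by line; -r keeps backslashes literal
while read -r line
do
    # The unquoted $line is split into words on whitespace
    for word in $line
    do
        ((words[$word]++))
    done
done

# "${!words[@]}" expands to the keys; the iteration order is unspecified
for i in "${!words[@]}"
do
    echo "$i:" "${words[$i]}"
done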

Test it:

$ echo this is a test is this | bash foo.sh
is: 2
this: 2
a: 1
test: 1
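
One possible reason for a total that comes out too low, offered here as an assumption rather than a diagnosis: `read` returns a non-zero status for a final line that has no trailing newline, so a plain `while read` loop skips the words on it. The sketch below keeps that last line and prints a total alongside the per-word counts. The names `count_words`, `fields`, and `total` are illustrative and do not come from the answer; bash 4 or later is assumed for associative arrays.

```shell
#!/bin/bash
# Sketch: per-word counts plus a running total, in pure bash (bash 4+).
# count_words is a hypothetical helper name, not part of the answer above.
count_words() {
    local -A words=()
    local -a fields
    local word total=0

    # read -a splits the line on whitespace without glob expansion.
    # "|| (( ${#fields[@]} ))" keeps a last line that has no trailing newline.
    while read -r -a fields || (( ${#fields[@]} )); do
        for word in "${fields[@]}"; do
            # Plain assignment: the key is not re-evaluated as arithmetic
            words[$word]=$(( ${words[$word]:-0} + 1 ))
            (( total += 1 ))
        done
    done

    # Key order of an associative array is unspecified
    for word in "${!words[@]}"; do
        echo "$word: ${words[$word]}"
    done
    echo "total: $total"
}
```

With the function defined, `echo this is a test is this | count_words` prints the per-word lines in unspecified order, followed by `total: 6`. The sketch is not hardened against every possible input.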

This answer is built almost entirely on these excellent answers: this and this. Don't forget to upvote them.

Answer 2

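Two improved versions of James Brown's answer (they take punctuation into account and cope with double- and single-quoted groups):

Punctuation is treated as part of the word: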

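#!/bin/bash
declare -A words

# -r stops read from interpreting backslashes in the input
while read -r line ; do
    for word in ${line} ; do
        # ${word@Q} (bash 4.4+) quotes the word before it is used as a key
        ((words[${word@Q}]++))
done ; done

# Quote the expansions so keys are not split or glob-expanded
for i in "${!words[@]}" ; do
    echo "${i}: ${words[$i]}"
done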
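
The `${word@Q}` expansion (bash 4.4 and later) produces the value in a quoted form that the shell can read back as a single word, which is presumably why this version uses it for words containing quote characters. A small sketch of that property follows; the variable names `quoted` and `restored` are illustrative only.

```shell
#!/bin/bash
# Sketch (bash 4.4+): ${var@Q} yields the value quoted for reuse as shell input.
word="don't"
quoted=${word@Q}

# Reading the quoted form back with eval reproduces the original word exactly.
eval "restored=$quoted"

printf 'original: %s\n' "$word"
printf 'quoted:   %s\n' "$quoted"
printf 'restored: %s\n' "$restored"
```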

Punctuation is not part of the word (as with wc):

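#!/bin/bash
declare -A words

while read -r line ; do
    # Delete every punctuation character from the line
    line="${line//[[:punct:]]}"
    for word in ${line} ; do
        ((words[${word}]++))
done ; done

# Quote the expansions so keys are not split or glob-expanded
for i in "${!words[@]}" ; do
    echo "${i}: ${words[$i]}"
done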
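
To make the effect of `${line//[[:punct:]]}` concrete, here is a minimal sketch. The helper name `strip_punct` is hypothetical and used only for illustration. Note that apostrophes are punctuation too, so contractions are collapsed into a single word.

```shell
#!/bin/bash
# Sketch: ${text//[[:punct:]]} deletes every punctuation character.
# strip_punct is a hypothetical helper name used only for this illustration.
strip_punct() {
    local text=$1
    printf '%s\n' "${text//[[:punct:]]}"
}

strip_punct "Hello, world! (test)"   # prints: Hello world test
strip_punct "don't stop"             # prints: dont stop
```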

The code was tested with tricky quoted text:

fortune -m "swear" | bash foo.sh
man bash | ./foo.sh | sort -gr -k2 | head

Print each word and its number of occurrences using pure `bash`
