Question

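I'm having some trouble with an exercise. Basically, the assignment is to open a URL, decode the page using a given encoding, and count how many times each of the given strings occurs in the text.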

import urllib2 as ul 

def word_counting(url, code, words):
    page = ul.urlopen(url)
    text = page.read()
    decoded = text.decode(code)
    result = {}

    for word in words:
        count = decoded.count(word)
        counted = str(word) + ":" + " " + str(count)
        result.append(counted)

    return finale

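The result I should get looks like "word1: x, word2: y, word3: z", where x, y, z are the numbers of occurrences. But it seems I only get a single number: when I run the test program, the results are just 9 for the first test, 14 for the second, and 5 for the third, with the other occurrences and the full set of counts missing. What am I doing wrong? Thanks in advance.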

Answer 1

You are not adding to the dictionary correctly.

The correct way is result[key] = value.

So your loop would be

for word in words:
  count = decoded.count(word)
  result[word] = str(count)

An example without the decoding, but using .count():

words = ['apple', 'apple', 'pear', 'banana']
result = {}
for word in words:
    count = words.count(word)
    result[word] = count

>>> result
{'pear': 1, 'apple': 2, 'banana': 1}
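
Putting the fix back into a function shaped like the one in the question might look roughly like the sketch below. It assumes Python 3, the name count_in_bytes is made up for illustration, and the download step is left out so the example runs offline. In Python 3 the bytes would come from urllib.request.urlopen(url).read() rather than urllib2.

```python
def count_in_bytes(raw, code, words):
    """Decode raw bytes with the given encoding and count each word."""
    decoded = raw.decode(code)
    result = {}
    for word in words:
        # Dictionaries have no append(); assign to a key instead.
        result[word] = decoded.count(word)
    return result


# Stand-in for page.read(); a real run would fetch these bytes from the URL.
sample = "apple pear apple banana".encode("utf-8")
counts = count_in_bytes(sample, "utf-8", ["apple", "pear", "kiwi"])
print(counts)
```

Returning the dictionary itself, rather than a single formatted value, keeps every word's count available to the caller.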

Answer 2

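Don't forget list and dict comprehensions. They can be very efficient on larger datasets (especially if you are analyzing a large web page, as in your example). And at the end of the day, even if your dataset is small, you could argue that the dict comprehension syntax is cleaner, more pythonic, and so on.

So in this case I would use something like: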

result = {word : decoded.count(word) for word in words}
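
As a small illustration of that one-liner, here is a sketch that also builds the "word1: x, word2: y" string the question asks for. The sample text and the variable name summary are assumptions for the example, and it assumes Python 3.7 or later so the dictionary keeps insertion order.

```python
decoded = "apple pie, pineapple, pear"  # stand-in for the decoded page text
words = ["apple", "pear"]

result = {word: decoded.count(word) for word in words}

# str.count() matches substrings, so "pineapple" also counts toward "apple".
summary = ", ".join("{}: {}".format(w, n) for w, n in result.items())
print(summary)
```

Note that because .count() works on substrings, "apple" is counted twice here. Whether that is what the exercise wants depends on how it defines an occurrence.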

Answer 3

Or you could use collections.Counter:

>>> from collections import Counter
>>> words = ['apple', 'apple', 'pear', 'banana']
>>> Counter(words)
Counter({'apple': 2, 'pear': 1, 'banana': 1})
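
Counter tallies the items of a list, so to apply it to page text you would first have to split the text into tokens. A minimal sketch, assuming that whitespace-separated tokens are good enough for the exercise (the sample text is made up):

```python
from collections import Counter

decoded = "apple pear apple banana"  # stand-in for the decoded page text
words = ["apple", "pear", "kiwi"]

# Counter tallies list items, so the text is split into tokens first.
tokens = Counter(decoded.split())

# A Counter returns 0 for missing keys instead of raising KeyError.
result = {word: tokens[word] for word in words}
print(result)
```

Unlike decoded.count(word), this counts whole tokens only, so a word embedded in a longer one or followed directly by punctuation would not be matched.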

Python exercise using a URL and string counting
