Question

    import sys
    import string
    from collections import Counter

    def count_words(input_file_path, *w_f):
        tab = dict.fromkeys([ord(i) for i in string.punctuation], u' ')
        words = []
        count = 0
        frequency = Counter()
        with open(input_file_path, "r") as fp:
            for line in fp.readlines():
                linei = line.translate(tab)
                words = linei.split()
                count += len(words)
                for item in words:
                    frequency[item] += 1
            if w_f:
                with open(w_f, "w") as wfp:
                    to_write = fp.read()
                    wfp.write(to_write)

        sorted(frequency.items())
        print "total word count %d" %  count
        print "Frequency "
        print frequency

    if __name__ == '__main__':
        if len(sys.argv) == 2:
            count_words(sys.argv[1])
        elif len(sys.argv) == 3:
            count_words(sys.argv[1], sys.argv[2])
        else: raise Exception("Insufficient Arguments")

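I wrote a program that counts words. It fails with a TypeError saying that translate expects a character buffer object. My guess is that this has something to do with unicode.

What exactly is the problem?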

Answer 1

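From the documentation on str.translate:

Translate the characters using table, which must be a 256-character string

However, you are passing a dict. A dict that maps ordinals to replacements is what unicode.translate (and str.translate in Python 3) accepts, but each line read from the file here is a Python 2 byte str, whose translate requires a 256-character string. To build a valid translation table, see string.maketrans (str.maketrans in Python 3).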
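
A minimal sketch of the fix, assuming Python 3 (the question's code is Python 2, where the table would come from string.maketrans instead). The name count_words is reused from the question, but here it takes an iterable of lines rather than a file path; that signature and the PUNCT_TO_SPACE constant are assumptions made to keep the example self-contained, not the asker's code.

```python
import string
from collections import Counter

# str.maketrans builds the table that str.translate expects (Python 3).
# Every punctuation character is mapped to a single space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def count_words(lines):
    """Return (total word count, Counter of word frequencies) for an iterable of lines."""
    frequency = Counter()
    for line in lines:
        # Replace punctuation with spaces, then split on whitespace.
        frequency.update(line.translate(PUNCT_TO_SPACE).split())
    return sum(frequency.values()), frequency


total, freq = count_words(["Hello, world!", "hello world; hello."])
print("total word count %d" % total)
print(freq)
```

Passing an open file object as lines works as well, since iterating over a text file yields its lines one at a time. Counting stays case-sensitive, as in the question, so "Hello" and "hello" are separate entries.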
