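Cannot add values to a Python dictionary and write it to a file

Time: 2015-02-14 15:05:23

Tags: python dictionary

I am trying to check whether a word already exists in a dict. If it does not, I add it to the dict as a key with a value.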

import io

dict = {}
with io.open("fileo.txt", "r", encoding="utf-8") as filei:
    for Word in filei:
        Word = Word.split()
        if not Word in dict:
            dict[Word] = 1
        elif Word in dict:
            dict[Word] = dict[Word] + 1
    print [unicode(i) for i in dict.items()]

It throws the following error:

if not Word in dict:
TypeError: unhashable type: 'list'

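If I remove the Word = Word.split() part it works, but then the whole line is treated as a single key. That is no use to me. I want to count every individual word, as you can see.

3 answers:

Answer 0 (score: 4)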

Word = Word.split() turns Word into a list, and you cannot use a list (or any other unhashable type) as a dictionary key.
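To make the failure concrete, here is a minimal sketch of the same situation. It is written in Python 3 syntax and the sample line is made up, so treat it as an illustration rather than the asker's actual data:

```python
line = "the cat and the hat"   # made-up sample line
words = line.split()           # split() returns a list of strings

counts = {}
try:
    counts[words] = 1          # a list is mutable, hence unhashable
except TypeError as exc:
    error_message = str(exc)   # the message mentions the unhashable list

# Each individual string is hashable, so it works as a key.
for word in words:
    counts[word] = counts.get(word, 0) + 1

print(error_message)
print(counts)
```

Looping over the list and using each string as the key is the change every answer here relies on.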

You should consider using collections.Counter, but here is your existing code with only slight modifications:

import io

with io.open("fileo.txt", "r", encoding="utf-8") as filei:
    d = {}
    for line in filei:
        # split the line into words and count each word separately
        words = line.strip().split()
        for word in words:
            if word in d:
                d[word] += 1
            else:
                d[word] = 1
    print d
    print [unicode(i) for i in d.items()]

Answer 1 (score: 1)

Since you have split the line into words, you can use a for loop to check and count each one:

words = Word.split()
for word in words:
    if not word in dict:
        ...

But since you are only counting words, I suggest using Counter instead:

import io
from collections import Counter

with io.open("fileo.txt", "r", encoding="utf-8") as f:
    word_count = Counter()
    for line in f:
        words = line.strip().split()  # strip() has to be called
        word_count.update(words)
    print [unicode(word) for word in word_count.most_common(100)]

This counts each distinct word and prints the 100 most common words at the end.
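As a rough illustration of how Counter.update and most_common behave, the sketch below uses a small in-memory list in place of the file. The sample lines are invented and the code uses Python 3 syntax:

```python
from collections import Counter

# Stand-in for the lines of a file (made-up data).
lines = ["the cat sat", "the cat ran", "the dog sat"]

word_count = Counter()
for line in lines:
    # update() adds to the existing counts instead of replacing them
    word_count.update(line.split())

top_two = word_count.most_common(2)  # list of (word, count) pairs
print(top_two)
print(word_count["missing"])         # absent keys count as 0, no KeyError
```

Because a Counter returns 0 for a missing key, no membership check is needed before incrementing.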

It can be written more concisely (provided your file is not too large, since the whole file is read at once):

with io.open("fileo.txt", "r", encoding="utf-8") as f:
    word_count = Counter(word.strip() for word in f.read().split())

Answer 2 (score: 1)

If you do not want to import and use a defaultdict or a Counter, use dict.setdefault and avoid the if/else. Use the word string as the key:

import io

dct = {}
with io.open("fileo.txt", "r", encoding="utf-8") as filei:
    for line in filei:
        words = line.split()
        for word in words:
            word = word.lower()
            # if the key does not exist, add it with a default value of 0
            dct.setdefault(word, 0)
            dct[word] += 1  # increment the count

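Use lowercase names for variables, and do not use dict as a variable name, because it shadows the built-in Python dict. I assume you consider Word and word to be the same word, so you need to call lower() on each word to catch any capitalized variants.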
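For comparison, the defaultdict alternative mentioned at the start of this answer might look like the sketch below. The input lines are made up, and the code is in Python 3 syntax:

```python
from collections import defaultdict

# Made-up input with mixed capitalization.
lines = ["Word word WORD", "other Word"]

dct = defaultdict(int)  # a missing key starts at int(), which is 0
for line in lines:
    for word in line.split():
        # lower() folds "Word", "word" and "WORD" into a single key
        dct[word.lower()] += 1

print(dict(dct))
```

The trade-off is one import in exchange for dropping the setdefault call; the counting logic is otherwise the same.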

If you want to store the dictionary in a file, use pickle or json:

import pickle
with open("count.pkl", "wb") as f:
    pickle.dump(dct ,f)

Loading it back is just as simple:

with open("count.pkl", "rb") as f:
   dct = pickle.load(f)
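A self-contained round trip can confirm that the loaded dictionary matches what was saved. This sketch writes to a temporary directory rather than the working directory; the counts are made up and the syntax is Python 3:

```python
import os
import pickle
import tempfile

dct = {"word": 4, "other": 1}  # made-up counts

# Write into a temporary directory so no existing file is touched.
path = os.path.join(tempfile.mkdtemp(), "count.pkl")

with open(path, "wb") as f:    # pickle requires binary mode
    pickle.dump(dct, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == dct)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.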

Use json for human-readable output in the file:

import json
with open("count.json", "w") as f:
    json.dump(dct,f)


with open("count.json") as f:
    dct = json.load(f)
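Since the input file is read as UTF-8, the words may contain non-ASCII characters. The sketch below is one possible way to keep them readable in the JSON file, using Python 3 syntax; the sample counts and the temporary path are assumptions made for illustration:

```python
import json
import os
import tempfile

dct = {"naïve": 2, "word": 4}  # made-up counts with a non-ASCII key

path = os.path.join(tempfile.mkdtemp(), "count.json")

with open(path, "w", encoding="utf-8") as f:
    # ensure_ascii=False writes "ï" as is instead of a \u escape;
    # indent and sort_keys make the file easier to read and diff.
    json.dump(dct, f, ensure_ascii=False, indent=2, sort_keys=True)

with open(path, encoding="utf-8") as f:
    text = f.read()

restored = json.loads(text)
print(text)
```

JSON object keys are always strings, which suits a word count, but a dictionary with integer or tuple keys would not survive the round trip unchanged.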