Question

I am computing TF-IDF, but I get an error in the IDF part. Can you help me figure out what is wrong?

TypeError: get expected at least 1 arguments, got 0

def computeIDF(docList):
    import math
    idfDict={}
    idfDict=dict.fromkeys(docList[0].get(),0)
    for doc in docList:
        for word, val in doc.items():
            if val > 0:
                idfDict[word]+=1
    for word, val in idfDict.items():
        idfDict[word]=math.log(3 / float(val))
    return idfDict

idfs1=computeIDF([DictA1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in computeIDF
TypeError: get expected at least 1 arguments, got 0

Answer 1

As the traceback shows, the problem is on line 4 of the function, namely

idfDict=dict.fromkeys(docList[0].get(),0)

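docList[0] is a dict, and the documentation gives the signature of its get method as:

get(key[, default])

The method expects you to name the key you want to look up. Because a dictionary is not a sequence, there is no sensible default key it could fall back on. It looks like you are trying to build a dictionary with the same keys as docList[0], the first document, but you do not need to do that. The usual way to add a key new_key to a Python dict is dict[new_key] = value: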

>>> d = dict()
>>> d['foo'] = 0
>>> d
{'foo': 0}

However, incrementing a key that does not exist raises a KeyError. To avoid that, use dict.get(new_key, 0), which returns 0 when the key is missing:

>>> d['bar'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'bar'
>>> d['bar'] = d.get('bar', 0) + 1
>>> d
{'foo': 0, 'bar': 1}

Another option is to catch the KeyError when you try to increment the key.
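A minimal sketch of that alternative, assuming a hypothetical list of words (the names words and counts are illustrative and not taken from the question):

```python
# Hypothetical input, used only to illustrate the pattern.
words = ["foo", "bar", "foo"]

counts = {}
for word in words:
    try:
        # Works once the key exists.
        counts[word] += 1
    except KeyError:
        # First time we see this word: start its count at 1.
        counts[word] = 1

print(counts)
```

This does the same job as the get-based version; which one reads better is largely a matter of taste.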

So the elegant solution is to add

idfDict[word] = idfDict.get(word, 0) + 1

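in the appropriate place (inside the inner loop, in place of idfDict[word]+=1) and remove the initialization line (line 4).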
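Put together, the function might look like the sketch below. Two things here are assumptions rather than part of the original code: the snake_case names (compute_idf, doc_list, doc_freq) and the use of len(doc_list) in place of the hard-coded 3. The sample documents are hypothetical.

```python
import math


def compute_idf(doc_list):
    """Return {word: log(N / document frequency)} for a list of word-count dicts."""
    doc_freq = {}
    for doc in doc_list:
        for word, count in doc.items():
            if count > 0:
                # get() returns 0 for a word we have not seen yet.
                doc_freq[word] = doc_freq.get(word, 0) + 1
    # Assumption: use the real number of documents instead of the constant 3.
    n_docs = len(doc_list)
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Hypothetical documents, each a dict of word counts.
docs = [{"cat": 2, "dog": 0}, {"cat": 1, "dog": 3}]
idfs = compute_idf(docs)
print(idfs)
```

Because a word is only added once it has a positive count in some document, no entry can have a document frequency of zero, so the division is always safe.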

Now that I have answered your question, I should mention a few code style issues:

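Try switching to snake_case if you use Python often: doc_list;
Put spaces around binary operators (e.g. +, =, ...);
Put a space after a comma: d.get('foo',␣0);
Use dict() instead of {}: some people might read the latter as a set.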

If you follow these rules, the Python community will find your code more readable. If you are interested in other conventions, see PEP 8.

Cheers!

Answer 2

When you call the get method of a dictionary, you must pass at least one argument: the key to look up.

idfDict=dict.fromkeys(docList[0].get(),0)

Maybe you meant to write this without get? Without knowing the structure of the dictionaries in docList, we cannot help further.

idfDict=dict.fromkeys(docList[0],0)
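As a quick check of what that line produces, here is a small sketch assuming each document is a dict of word counts (doc_a is a hypothetical stand-in for the question's DictA1):

```python
# Hypothetical document; the real structure of DictA1 is not shown in the question.
doc_a = {"cat": 2, "dog": 0}

# Iterating over a dict yields its keys, so no call to get() is needed.
idf_dict = dict.fromkeys(doc_a, 0)
print(idf_dict)  # {'cat': 0, 'dog': 0}
```

Note that this only creates entries for words in the first document; a word that appears only in a later document would still raise a KeyError on idfDict[word]+=1.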

TF-IDF in Python 3