Question

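I copied the following code from page 118 of the book Programming Collective Intelligence, in the chapter "Document Filtering". The function breaks text into words by splitting on any non-alphabetic character. This leaves only the actual words, all converted to lowercase.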

import re                                          
import math
def getwords(doc):
    splitter=re.compile('\\W*')
    words=[s.lower() for s in splitter.split(doc) 
           if len(s)>2 and len(s)<20]
    return dict([(w,1) for w in words])

I implemented the function and got the following error:

>>> import docclass
>>> t=docclass.getwords(s)
Traceback (most recent call last):
  File "<pyshell#15>", line 1, in <module>
    t=docclass.getwords(s)
  File "docclass.py", line 6, in getwords
    words=[s.lower() for s in splitter.split(doc)
NameError: global name 'splitter' is not defined

Answer 1

It works here:

>>> import re
>>> 
>>> def getwords(doc):
...     splitter=re.compile('\\W*')
...     words=[s.lower() for s in splitter.split(doc) 
...            if len(s)>2 and len(s)<20]
...     return dict([(w,1) for w in words])
... 
>>> getwords ("He's fallen in the water!");
{'water': 1, 'the': 1, 'fallen': 1}
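
One caveat: the session above is Python 2. Since Python 3.7, re.split also splits on empty matches, so a pattern like '\\W*' that can match the empty string cuts the text into single characters and the length filter then throws everything away. A minimal sketch of how the same function could be written for Python 3, assuming that switching the pattern to \W+ is acceptable (that change is mine, not the book's):

```python
import re

def getwords(doc):
    # Assumption: \W+ (one or more non-word characters) replaces the
    # book's \W*, because \W+ can never match the empty string.
    splitter = re.compile(r'\W+')
    # Keep only tokens longer than 2 and shorter than 20 characters.
    words = [s.lower() for s in splitter.split(doc) if 2 < len(s) < 20]
    # Map each distinct word to 1, as the original function does.
    return {w: 1 for w in words}

result = getwords("He's fallen in the water!")
print(result)
```

This should give the same three words as the Python 2 session above.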

My guess is that you had a typo in your code, but fixed it when you pasted it here.
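
To illustrate the kind of typo that would produce this NameError, here is a small sketch. The misspelled name spliter and the function name getwords_typo are hypothetical; I don't know what the actual file contained.

```python
import re

def getwords_typo(doc):
    # Hypothetical typo: the pattern is stored as "spliter" (one t) ...
    spliter = re.compile(r'\W+')
    # ... but read back as "splitter", which was never assigned anywhere.
    return [s.lower() for s in splitter.split(doc)]

try:
    getwords_typo("He's fallen in the water!")
except NameError as exc:
    message = str(exc)
    print(message)
```

Python only looks the name up when the line runs, so the module imports fine and the error shows up on the first call. Comparing the spelling on the assignment line with the spelling on the line in the traceback is usually enough to find it.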
