Question

I am currently using Python 3.4. When I call nltk.text, I get the following error:

Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    nltk.text(tokens)
TypeError: 'module' object is not callable

Can anyone help me?

Answer 1

As @jonrsharpe said, nltk.text is a module, not a class or a function, so it cannot be called.

>>> import nltk
>>> nltk.text
<module 'nltk.text' from '/usr/local/lib/python3.4/dist-packages/nltk/text.py'>
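The same kind of check works for any name you are unsure about. Here is a minimal sketch that uses only the standard library: json.decoder stands in for nltk.text and json.decoder.JSONDecoder for the class inside it. The helper name describe is hypothetical and is not part of NLTK.

```python
import inspect
import json  # stand-in package: json.decoder plays the role of nltk.text


def describe(obj):
    """Return 'module', 'class', 'callable', or 'other' for obj."""
    if inspect.ismodule(obj):
        return "module"
    if inspect.isclass(obj):
        return "class"
    if callable(obj):
        return "callable"
    return "other"


print(describe(json.decoder))              # a submodule, like nltk.text
print(describe(json.decoder.JSONDecoder))  # a class inside it, like nltk.text.Text

# Calling the module itself raises the same kind of TypeError as in the question.
try:
    json.decoder("{}")
except TypeError as err:
    print("TypeError:", err)
```

If describe reports "module", look inside the module for the class or function you actually want to call.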

But there is a class called Text in nltk.text:

>>> nltk.text.Text
<class 'nltk.text.Text'>

You can, however, skip the text namespace, since the class is also exposed ("surfaced") at the top level of the nltk package:

>>> import nltk
>>> nltk.Text
<class 'nltk.text.Text'>

It is a class that takes tokens (i.e. a list of strings) and processes them into an "nltk-able" corpus. E.g.

>>> from nltk import word_tokenize
>>> string = "This is a small foobar corpus, with foobar sentence. Hello World, yes it's a foobar day"
>>> mycorpus = nltk.Text(word_tokenize(string))
>>> mycorpus
<Text: This is a small foobar corpus , with...>

With the Text object, you can run several kinds of corpus analysis:

>>> mycorpus
<Text: This is a small foobar corpus , with...>
>>> mycorpus.count('foobar')
3
>>> mycorpus.concordance('foobar')
Displaying 3 of 3 matches:
                                    foobar corpus , with foobar sentence . Hel
                                    foobar sentence . Hello World , yes it 's 
                                    foobar day
>>> mycorpus.index('foobar') # i.e. first instance of 'foobar'
4
>>> mycorpus.vocab()
FreqDist({'foobar': 3, ',': 2, 'a': 2, 'Hello': 1, 'small': 1, 'day': 1, 'is': 1, 'yes': 1, 'corpus': 1, 'This': 1, ...})
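The session above can also be written as a standalone script. This is only a sketch under one assumption: plain whitespace splitting replaces word_tokenize so that no tokenizer data has to be downloaded, and the sentence is simplified (no punctuation) to suit that.

```python
import nltk

# Assumption: str.split() stands in for nltk.word_tokenize here, so the
# script runs without downloading any tokenizer models.
sentence = "This is a small foobar corpus with foobar sentence and a foobar day"
tokens = sentence.split()

# nltk.Text is the same class as nltk.text.Text; call the class, not the module.
corpus = nltk.Text(tokens)

foobar_count = corpus.count("foobar")        # number of occurrences
first_index = corpus.index("foobar")         # position of the first occurrence
top_word = corpus.vocab().most_common(1)[0]  # most frequent token and its count

print(foobar_count, first_index, top_word)
```

With real text you would normally keep word_tokenize, as in the session above, so that punctuation is split off into separate tokens.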

For details on how nltk.text.Text works, see https://github.com/nltk/nltk/blob/develop/nltk/text.py#L262 and

>>> dir(mycorpus)
['_CONTEXT_RE', '_COPY_TOKENS', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_concordance_index', '_context', '_vocab', 'collocations', 'common_contexts', 'concordance', 'count', 'dispersion_plot', 'findall', 'index', 'name', 'plot', 'readability', 'similar', 'tokens', 'unicode_repr', 'vocab']

See also https://github.com/nltk/nltk/blob/develop/nltk/text.py#L575 and try:

>>> nltk.text.demo()
