nltk.text() throws an error

Asked: 2015-04-08 12:34:47

Tags: python python-3.x nlp nltk corpus

I am currently using Python 3.4. When I call nltk.text, I get the following error:

Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    nltk.text(tokens)
TypeError: 'module' object is not callable

Can anyone help me?

1 answer:

Answer 0 (score: 0)

As @jonrsharpe said, nltk.text is a module, not a class or a function.

>>> import nltk
>>> nltk.text
<module 'nltk.text' from '/usr/local/lib/python3.4/dist-packages/nltk/text.py'>

However, that module contains a class called Text:

>>> nltk.text.Text
<class 'nltk.text.Text'>

You can also skip the text namespace, because the class is exposed at the top level of the nltk package:

>>> import nltk
>>> nltk.Text
<class 'nltk.text.Text'>
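
The same mistake can be reproduced with the standard library alone, which may make the module-versus-class distinction easier to see. This is only a sketch: datetime is used as a stand-in and is not part of the original question, and the variable names are made up for illustration.

```python
import datetime

# datetime (the module) contains datetime (the class), much like
# nltk.text (the module) contains Text (the class).
print(callable(datetime))           # a module cannot be called
print(callable(datetime.datetime))  # the class inside it can

try:
    datetime(2015, 4, 8)            # same kind of mistake as nltk.text(tokens)
except TypeError as err:
    module_call_error = str(err)
    print("TypeError:", module_call_error)

# Calling the class inside the module works.
asked_on = datetime.datetime(2015, 4, 8)
print(asked_on.year)
```

If a call fails with "'module' object is not callable", printing the object (as done with nltk.text above) is usually enough to confirm that it is a module and to go looking for the class inside it.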

It is a class that takes tokens (i.e. a list of strings) and turns them into an "nltk-able" corpus. For example:

>>> from nltk import word_tokenize
>>> string = "This is a small foobar corpus, with foobar sentence. Hello World, yes it's a foobar day"
>>> mycorpus = nltk.Text(word_tokenize(string))
>>> mycorpus
<Text: This is a small foobar corpus , with...>

With a Text object, you can run several kinds of corpus analysis:

>>> mycorpus
<Text: This is a small foobar corpus , with...>
>>> mycorpus.count('foobar')
3
>>> mycorpus.concordance('foobar')
Displaying 3 of 3 matches:
                                    foobar corpus , with foobar sentence . Hel
                                    foobar sentence . Hello World , yes it 's 
                                    foobar day
>>> mycorpus.index('foobar') # i.e. first instance of 'foobar'
4
>>> mycorpus.vocab()
FreqDist({'foobar': 3, ',': 2, 'a': 2, 'Hello': 1, 'small': 1, 'day': 1, 'is': 1, 'yes': 1, 'corpus': 1, 'This': 1, ...})
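
To get a feel for what these calls compute, here is a rough standard-library analogue working on a plain token list. It is a sketch rather than NLTK's actual implementation: the token list is copied by hand from the tokenized output shown above, and simple_concordance is a hypothetical helper written only for this illustration.

```python
from collections import Counter

# Tokens as they appear in the Text output above (typed in by hand,
# so that this sketch does not need NLTK or its tokenizer data).
tokens = ["This", "is", "a", "small", "foobar", "corpus", ",", "with",
          "foobar", "sentence", ".", "Hello", "World", ",", "yes", "it",
          "'s", "a", "foobar", "day"]

foobar_count = tokens.count("foobar")   # analogue of mycorpus.count('foobar')
first_index = tokens.index("foobar")    # analogue of mycorpus.index('foobar')
vocab = Counter(tokens)                 # analogue of mycorpus.vocab()


def simple_concordance(tokens, word, width=3):
    """Return each occurrence of word with up to `width` tokens of context per side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == word:
            window = tokens[max(0, i - width): i + width + 1]
            lines.append(" ".join(window))
    return lines


contexts = simple_concordance(tokens, "foobar")

print(foobar_count, first_index)
print(vocab["foobar"], vocab[","], vocab["a"])
for line in contexts:
    print(line)
```

The Text class adds conveniences on top of this, such as aligned concordance output and the other methods listed by dir(mycorpus).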

For details on how nltk.text.Text works, see https://github.com/nltk/nltk/blob/develop/nltk/text.py#L262

>>> dir(mycorpus)
['_CONTEXT_RE', '_COPY_TOKENS', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_concordance_index', '_context', '_vocab', 'collocations', 'common_contexts', 'concordance', 'count', 'dispersion_plot', 'findall', 'index', 'name', 'plot', 'readability', 'similar', 'tokens', 'unicode_repr', 'vocab']

See also https://github.com/nltk/nltk/blob/develop/nltk/text.py#L575 and try:

>>> nltk.text.demo()