I am currently using Python 3.4. When I call nltk.text, I get the following error:
Traceback (most recent call last):
File "<pyshell#20>", line 1, in <module>
nltk.text(tokens)
TypeError: 'module' object is not callable
Can anyone help me?
Answer 0 (score: 0)
As @jonrsharpe said, nltk.text is a module, not a class or a function, so it cannot be called:
>>> import nltk
>>> nltk.text
<module 'nltk.text' from '/usr/local/lib/python3.4/dist-packages/nltk/text.py'>
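The same error can be reproduced with any module, which may make the cause easier to see. The sketch below uses the standard-library json module purely as a stand-in for nltk.text (that substitution is an assumption made so the snippet runs without NLTK installed); it is an illustration, not NLTK-specific code.

```python
import json
import types

# A module is an object of type ModuleType, and module objects are not callable.
print(isinstance(json, types.ModuleType))
print(callable(json))

try:
    json("{}")  # same kind of mistake as nltk.text(tokens)
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# Classes defined inside a module are callable, so this works.
print(callable(json.JSONDecoder))
decoder = json.JSONDecoder()
```

Applied to the question, the fix is to call the class inside the module rather than the module itself.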
However, the nltk.text module contains a class called Text:
>>> nltk.text.Text
<class 'nltk.text.Text'>
You can also skip the nltk.text namespace, because the Text class is exposed at the top level of the nltk package:
>>> import nltk
>>> nltk.Text
<class 'nltk.text.Text'>
Text is a class that reads tokens (i.e. a list of strings) and wraps them in an "nltk-able" corpus. E.g.
>>> from nltk import word_tokenize
>>> string = "This is a small foobar corpus, with foobar sentence. Hello World, yes it's a foobar day"
>>> mycorpus = nltk.Text(word_tokenize(string))
>>> mycorpus
<Text: This is a small foobar corpus , with...>
With the Text object, you can run several kinds of corpus analysis:
>>> mycorpus
<Text: This is a small foobar corpus , with...>
>>> mycorpus.count('foobar')
3
>>> mycorpus.concordance('foobar')
Displaying 3 of 3 matches:
foobar corpus , with foobar sentence . Hel
foobar sentence . Hello World , yes it 's
foobar day
>>> mycorpus.index('foobar') # i.e. first instance of 'foobar'
4
>>> mycorpus.vocab()
FreqDist({'foobar': 3, ',': 2, 'a': 2, 'Hello': 1, 'small': 1, 'day': 1, 'is': 1, 'yes': 1, 'corpus': 1, 'This': 1, ...})
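To make these results less opaque, here is a rough plain-Python analogue of count, index, vocab and concordance. It is only a sketch of the idea, not how NLTK implements them: the token list is written out by hand to mirror the tokenized sentence, collections.Counter stands in for FreqDist, and concordance_lines is a hypothetical helper that uses a token window rather than NLTK's character-width display.

```python
from collections import Counter

# Tokens written out by hand; they mirror the tokenized example sentence.
tokens = ["This", "is", "a", "small", "foobar", "corpus", ",", "with",
          "foobar", "sentence", ".", "Hello", "World", ",", "yes", "it",
          "'s", "a", "foobar", "day"]

foobar_count = tokens.count("foobar")  # number of occurrences
first_index = tokens.index("foobar")   # position of the first occurrence
vocab = Counter(tokens)                # token -> frequency


def concordance_lines(tokens, word, width=3):
    """Return each match with up to `width` tokens of context on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == word:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [tok] + right))
    return lines


print(foobar_count, first_index)
print(vocab.most_common(3))
for line in concordance_lines(tokens, "foobar"):
    print(line)
```

The Text class adds conveniences on top of this, such as lazily built indexes, which is why it is usually preferable to hand-rolled loops on larger corpora.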
For details on how nltk.text.Text works, see https://github.com/nltk/nltk/blob/develop/nltk/text.py#L262 and inspect the object's attributes:
>>> dir(mycorpus)
['_CONTEXT_RE', '_COPY_TOKENS', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_concordance_index', '_context', '_vocab', 'collocations', 'common_contexts', 'concordance', 'count', 'dispersion_plot', 'findall', 'index', 'name', 'plot', 'readability', 'similar', 'tokens', 'unicode_repr', 'vocab']
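Most of that list is dunder and private names. A small filter leaves only the public attributes, which are the ones worth trying first. The sketch below uses a hypothetical TinyText class so that it runs without NLTK; with NLTK available, the same list comprehension could be applied to mycorpus.

```python
class TinyText:
    """Hypothetical stand-in for a Text-like object (not part of NLTK)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self._cache = None  # private attribute, filtered out below

    def count(self, word):
        return self.tokens.count(word)


obj = TinyText(["foobar", "day"])

# Keep only names that do not start with an underscore.
public_names = [name for name in dir(obj) if not name.startswith("_")]
print(public_names)
```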
See also https://github.com/nltk/nltk/blob/develop/nltk/text.py#L575 and try:
>>> nltk.text.demo()