Using nltk with Python

Time: 2016-04-05 17:11:30

Tags: python nltk tagging

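Hey guys, I'm trying to use the nltk library with Python and I've run into an error in the POS tagging function that I can't make sense of. I'm running the code below in a Windows command window. The code runs up to the line where the text is tagged:
        PosTokens = [pos_tag(e) for e in tokens]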

from nltk import *

def main():

   text = "Hello my name is Bob. I am 12 years old."
   sentences = tokenize.sent_tokenize(text)
   print(sentences)
   tokens = [tokenize.word_tokenize(s) for s in sentences]
   print(tokens)
   PosTokens = [pos_tag(e) for e in tokens]

   return;

if __name__ == "__main__":
   main()

I get the following output:

['Hello my name is Bob.', 'I am 12 years old.']
[['Hello', 'my', 'name', 'is', 'Bob', '.'], ['I', 'am', '12', 'years', 'old', '.']]
Traceback (most recent call last):
  File "test.py", line 15, in <module>
    main()
  File "test.py", line 10, in main
    PosTokens = [pos_tag(e) for e in tokens]
  File "test.py", line 10, in <listcomp>
    PosTokens = [pos_tag(e) for e in tokens]
  File "C:\Python34\lib\site-packages\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Python34\lib\site-packages\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Python34\lib\site-packages\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Python34\lib\site-packages\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Python34\lib\site-packages\nltk\data.py", line 924, in _open
    return urlopen(resource_url)
  File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 455, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 478, in _open
    'unknown_open', req)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1303, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: c>

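As you can see, tokenizing works, but I get an error when the POS tagging function runs. Does anyone know how to fix this error? I'm running Python 3.4.0. Thanks for any answers.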
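
A possible reading of the last traceback frames: the "c" in "unknown url type: c" looks like a Windows drive letter. If the tagger model location reaches urlopen as a bare path such as C:\..., the URL parser treats everything before the first colon as the scheme. The sketch below uses only the standard library to show that behaviour and one way to build an unambiguous file: URL. The model path is hypothetical (the traceback does not show the real one), and this illustrates the symptom rather than NLTK's own loading code.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlparse

# Hypothetical model location; the real path is not shown in the traceback.
model_path = r"C:\nltk_data\taggers\averaged_perceptron_tagger\averaged_perceptron_tagger.pickle"

# A bare Windows path parses as a URL whose scheme is the drive letter.
bare_scheme = urlparse(model_path).scheme

# The same path written as a file: URL carries an explicit scheme.
file_url = PureWindowsPath(model_path).as_uri()
file_scheme = urlparse(file_url).scheme

print(bare_scheme)   # c
print(file_url)
print(file_scheme)   # file
```

If this is what is happening, the path is being built inside the installed nltk package rather than in test.py, so trying a different NLTK release may be a more practical fix than changing the script.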

0 answers:

No answers