I am trying to tokenize a sentence using nltk. When I run it through the Python shell, I get the correct answer.
>>> import nltk
>>> sentence = "Mohanlal made his acting debut in Thiranottam (1978), but the film got released only after 25 years due to censorship issues."
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['Mohanlal', 'made', 'his', 'acting', 'debut', 'in', 'Thiranottam', '(', '1978', ')', ',', 'but', 'the', 'film', 'got', 'released', 'only', 'after', '25', 'years', 'due', 'to', 'censorship', 'issues', '.']
But when I put the same code in a file and try to run it, I get the following error.
Traceback (most recent call last):
File "tokenize.py", line 1, in <module>
import nltk
File "/usr/local/lib/python2.7/dist-packages/nltk/__init__.py", line 114, in <module>
from nltk.collocations import *
File "/usr/local/lib/python2.7/dist-packages/nltk/collocations.py", line 38, in <module>
from nltk.util import ngrams
File "/usr/local/lib/python2.7/dist-packages/nltk/util.py", line 13, in <module>
import pydoc
File "/usr/lib/python2.7/pydoc.py", line 55, in <module>
import sys, imp, os, re, types, inspect, __builtin__, pkgutil, warnings
File "/usr/lib/python2.7/inspect.py", line 39, in <module>
import tokenize
File "/home/gadheyan/Project/Codes/tokenize.py", line 2, in <module>
from nltk import word_tokenize
ImportError: cannot import name word_tokenize
Here is the code I ran.
import nltk
from nltk import word_tokenize
sentence = "Mohanlal made his acting debut in Thiranottam (1978), but the film got released only after 25 years due to censorship issues."
tokens = nltk.word_tokenize(sentence)
print tokens
Answer 0 (score: 2)
TL;DR

This is a naming conflict; see Python failed to `import nltk` in my script but works in the interpreter.

Rename your file to my_tokenize.py instead of tokenize.py, i.e.
$ mv /home/gadheyan/Project/Codes/tokenize.py /home/gadheyan/Project/Codes/my_tokenize.py
$ python my_tokenize.py
In long:

From your traceback, you can see:
File "/usr/lib/python2.7/inspect.py", line 39, in <module>
import tokenize
File "/home/gadheyan/Project/Codes/tokenize.py", line 2, in <module>
from nltk import word_tokenize
In NLTK, nltk.word_tokenize lives in a module named tokenize.py, i.e. nltk.tokenize (http://www.nltk.org/_modules/nltk/tokenize.html). The standard-library inspect.py module also does import tokenize. Because your script is named tokenize.py and its directory comes first on the module search path, that import resolves to /home/gadheyan/Project/Codes/tokenize.py instead of the standard-library tokenize module. Your script then runs from nltk import word_tokenize while nltk itself is still only partially initialized, which raises the ImportError.
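As a quick diagnostic (a sketch, not part of the original fix): a module's __file__ attribute shows which file Python actually imported, so you can spot this kind of shadowing directly. Run this from the directory containing your script:

```python
import sys

# Which file did 'import tokenize' actually load? If a local
# tokenize.py shadows the standard library, its path shows up here.
import tokenize
print(tokenize.__file__)

# The script's own directory sits first on the search path, which is
# why a local tokenize.py wins over the standard-library module.
print(sys.path[0])
```

If the first line prints a path inside your project instead of the Python installation, the local file is shadowing the standard-library module and renaming it fixes the problem.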
BTW

Redundant namespacing still works in Python, but it's best to keep your namespaces and globals clean, i.e. use this:
alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk import word_tokenize
>>> sent = 'this is a foo bar sentence'
>>> word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> exit()
Or this:
alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> sent = 'this is a foo bar sentence'
>>> nltk.word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> exit()
But try to avoid this (although it still works):
alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> from nltk import word_tokenize
>>> sent = 'this is a foo bar sentence'
>>> word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> nltk.word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> exit()