Setting up nltk data from a GitHub repo

Time: 2018-03-06 10:48:15

Tags: python nltk nltk-book

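I am following chapter 1 of the nltk book. I was able to install nltk (import nltk), but I cannot download the book corpora by running nltk.download(). It gives me a getaddrinfo failed error. So I started skimming through the commands in the chapter without executing them, since most of the examples need the book corpora.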

But now I want to try the FreqDist example.

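While running FreqDist, I realized that I had not run from nltk.book import *. So I tried once more to install the book corpora. By now I am thoroughly exhausted from trying the different solutions given in various posts to fix the getaddrinfo failed error that occurs when downloading the nltk data. (I tried setting up the corporate proxy, changing the nltk downloader source link, and many other things.)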
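For reference, the error text points at host-name resolution (Python's socket.getaddrinfo) rather than at nltk itself. A minimal sketch of a check that can be run from the same interpreter follows; can_resolve is a hypothetical helper name, and raw.githubusercontent.com is only assumed here to be the host the downloader contacts, so substitute whichever host the failing request actually targets.

```python
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves to at least one address."""
    try:
        # getaddrinfo performs the same name lookup that the failing
        # download attempt would perform before opening a connection.
        return len(socket.getaddrinfo(host, port)) > 0
    except socket.gaierror:
        # This is the exception class behind "getaddrinfo failed".
        return False
```

Calling can_resolve("raw.githubusercontent.com") and getting False would suggest that DNS or the proxy setup is the blocker, independent of any nltk setting.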

So, instead of going the nltk downloader route, I tried something that is possibly weird. I downloaded the zip from https://github.com/nltk/ , extracted it, and then ran setup.py inside it.

Now, when I run from nltk.book import *, I get the following output:

>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
Traceback (most recent call last):
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\corpus\util.py", line 63, in __load
    try: root = nltk.data.find('corpora/%s' % zip_name)
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\data.py", line 641, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource 'corpora/gutenberg.zip/gutenberg/' not found.  Please
  use the NLTK Downloader to obtain the resource:  >>>
  nltk.download()
  Searched in:
    - 'C:\\Users\\593932/nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - 'D:\\path\\Softwares\\python\\WinPython-64bit-3.4.4.4Qt5\\python-3.4.4.amd64\\nltk_data'
    - 'D:\\path\\Softwares\\python\\WinPython-64bit-3.4.4.4Qt5\\python-3.4.4.amd64\\lib\\nltk_data'
    - 'C:\\Users\\593932\\AppData\\Roaming\\nltk_data'
**********************************************************************

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\book.py", line 20, in <module>
    text1 = Text(gutenberg.words('melville-moby_dick.txt'))
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\corpus\util.py", line 99, in __getattr__
    self.__load()
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\corpus\util.py", line 64, in __load
    except LookupError: raise e
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\corpus\util.py", line 61, in __load
    root = nltk.data.find('corpora/%s' % self.__name)
  File "D:\path\Softwares\python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\nltk\data.py", line 641, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource 'corpora/gutenberg' not found.  Please use the NLTK
  Downloader to obtain the resource:  >>> nltk.download()
  Searched in:
    - 'C:\\Users\\593932/nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - 'D:\\path\\Softwares\\python\\WinPython-64bit-3.4.4.4Qt5\\python-3.4.4.amd64\\nltk_data'
    - 'D:\\path\\Softwares\\python\\WinPython-64bit-3.4.4.4Qt5\\python-3.4.4.amd64\\lib\\nltk_data'
    - 'C:\\Users\\593932\\AppData\\Roaming\\nltk_data'
**********************************************************************
>>>
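
Reading the two LookupError messages together, the loader appears to want either a corpora/gutenberg directory or a corpora/gutenberg.zip archive under one of the "Searched in" folders. The sketch below only checks those two locations. It is not nltk's own lookup code, and locate_corpus is a hypothetical helper name.

```python
import os


def locate_corpus(search_dirs, name):
    """Return the first path that could satisfy a 'corpora/<name>' lookup,
    or None if no search directory contains it."""
    for base in search_dirs:
        unpacked = os.path.join(base, "corpora", name)
        zipped = unpacked + ".zip"
        if os.path.isdir(unpacked):
            return unpacked
        if os.path.isfile(zipped):
            return zipped
    return None
```

Passing the folders from the traceback, for example locate_corpus(["C:\\nltk_data", "D:\\nltk_data"], "gutenberg"), would show whether any of them already holds the corpus in the expected layout.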

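I manually copied the nltk folder (the one containing book.py) into the various folders listed above, but it did not help. How can I get the book to import in my interpreter environment from the zip downloaded from GitHub, without using the nltk downloader? Is it even possible?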
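
One possible direction, sketched under assumptions: the zip that ships setup.py is presumably the nltk source code, while the corpus archives live in the separate nltk_data repository on GitHub, under packages/corpora/. Assuming that repository has been downloaded and extracted by hand, the hypothetical helper below copies one archive into a data folder and unpacks it there. install_corpus and its parameter names are illustrative, not part of nltk.

```python
import os
import shutil
import zipfile


def install_corpus(repo_dir, data_dir, category, name):
    """Copy packages/<category>/<name>.zip from an extracted nltk_data
    checkout into data_dir/<category>/ and unpack it next to the copy."""
    src = os.path.join(repo_dir, "packages", category, name + ".zip")
    if not os.path.isfile(src):
        raise FileNotFoundError(src)
    dest = os.path.join(data_dir, category)
    os.makedirs(dest, exist_ok=True)
    copied = os.path.join(dest, name + ".zip")
    shutil.copyfile(src, copied)
    # The archive is assumed to hold a top-level <name>/ folder,
    # as the 'corpora/gutenberg.zip/gutenberg/' path in the traceback suggests.
    with zipfile.ZipFile(copied) as archive:
        archive.extractall(dest)
    return os.path.join(dest, name)
```

With data_dir set to one of the searched folders from the traceback (for example C:\nltk_data), a call such as install_corpus(repo_dir, "C:\\nltk_data", "corpora", "gutenberg") should produce the corpora/gutenberg layout the error asks for. nltk.book loads several corpora and the traceback only shows the first one it missed, so the call would probably need repeating for each corpus reported as missing. If a different data_dir is used, it can be added at runtime with nltk.data.path.append(data_dir).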

0 answers:

No answers