Question

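I have created a corpus named abc, but I am unable to load it in Python.

The problems I am facing:

1) Should I place my self-made corpus in the same location as all the pre-built corpora?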

1.a) If so, why am I not able to use this command (assuming the location is 'LOCATION')?

abc = nltk.data.find('LOCATION\abc')

1.b) In fact,

from nltk import abc

throws an error.

2) What other ways are there to load the corpus I have created?

Answer 1

I think you are looking for the first or second answer to this other question.

In any case, here is a quick way to do it:

import nltk
from nltk.corpus import PlaintextCorpusReader

corpus_root = './'  # directory that holds your corpus files
newcorpus = PlaintextCorpusReader(corpus_root, '.*')  # regex selecting the files you want to add
newcorpus.words('file-1.txt')  # tokenized words of a single file

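No, putting your own corpus in nltk's data directory does not seem like a good idea. There is no special reason for that, other than keeping your data separate from the data that ships with the toolkit.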
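For a slightly fuller picture, here is a minimal sketch of the same approach that can be run end to end. It assumes nltk is installed. The temporary directory, the file names, and the sample sentences are hypothetical and stand in for your real corpus. The pattern is narrowed to `.txt` files so that unrelated files under the root are not picked up.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader

# Hypothetical corpus: two small text files in a throwaway directory,
# kept outside nltk's own data directory.
corpus_root = tempfile.mkdtemp(prefix="my_corpus_")
samples = {
    "file-1.txt": "Hello corpus world.",
    "file-2.txt": "A second document.",
}
for name, text in samples.items():
    with open(os.path.join(corpus_root, name), "w", encoding="utf-8") as fh:
        fh.write(text)

# Match only .txt files instead of everything under the root.
newcorpus = PlaintextCorpusReader(corpus_root, r".*\.txt")

file_ids = newcorpus.fileids()                      # names of the matching files
first_words = list(newcorpus.words("file-1.txt"))   # tokens from one file
all_words = list(newcorpus.words())                 # tokens from every file
```

Calling `words()` with no argument reads every file the pattern matched, while passing a file id restricts it to that one file.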
