Question

I'm working with two lists generated by NLTK's PlaintextCorpusReader, and I want to combine them into a dictionary.

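The dictionary's keys should be the sentences in the corpus, which I extract with PlaintextCorpusReader's .sents(). The values should be the file ID of the file in which each sentence occurs, which I extract with .fileids().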

.fileids() returns the file IDs as strings, for example:

['R_v_Cole_2007.txt', 'R_v_Sellick_2005.txt']

.sents() returns list(list(str)), for example:

[[u'1', u'.'], [u'The', u'Registrar', u'has', u'referred', u'to', u'this', u'Court', u'two', u'applications', u'for', u'permission', u'to', u'appeal', u'against', u'conviction', u'to', u'be', u'heard', u'together', u'.'], ...]

I have tried a number of things, mostly adapted from this similar question, but everything I try results in the following error:

TypeError: unhashable type: 'list'
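
For context, this error usually means a list was used as a dictionary key: each item from .sents() is a list of tokens, and lists are mutable and therefore not hashable. A minimal sketch that reproduces the error and shows two hashable alternatives (the sample sentence and file ID are taken from the examples above; the variable names are only illustrative):

```python
# One tokenized sentence, in the shape that .sents() yields.
sentence = ['1', '.']

try:
    # A list cannot be a dictionary key, so this raises TypeError.
    {sentence: 'R_v_Cole_2007.txt'}
except TypeError as exc:
    error_message = str(exc)

# A tuple of tokens, or the tokens joined into one string, is hashable.
by_tuple = {tuple(sentence): 'R_v_Cole_2007.txt'}
by_string = {' '.join(sentence): 'R_v_Cole_2007.txt'}
```

Whether a tuple or a joined string is the better key depends on whether the individual tokens are needed later.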

Where am I going wrong?

The code I'm using to get the contents for the dictionary is as follows:

from nltk.corpus import PlaintextCorpusReader

corpus_root = '/Users/danielhoadley/Documents/Python/NLTK/text/'
wordlists = PlaintextCorpusReader(corpus_root, '.*')

dictionary = {}

values = wordlists.fileids()
keys = wordlists.sents()

## How do I get the keys and values into a dictionary from here?
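
One possible approach, offered as a sketch rather than a definitive answer: .sents() with no arguments returns the sentences of every file in one flat sequence, so it cannot be paired with .fileids() item by item. Asking the reader for the sentences of one file at a time keeps track of where each sentence came from. The function name map_sentences_to_files is hypothetical, and storing a list of file IDs per sentence is an assumption made here because the same sentence (for example '1 .') can appear in more than one file.

```python
def map_sentences_to_files(reader):
    """Map each sentence (a tuple of tokens) to the file IDs that contain it.

    `reader` is assumed to behave like PlaintextCorpusReader: it offers
    fileids() and sents(fileids=...).
    """
    mapping = {}
    for fileid in reader.fileids():
        # Restrict sents() to a single file so the source file is known.
        for sent in reader.sents(fileids=fileid):
            # Tuples are hashable, so they can serve as dictionary keys.
            files = mapping.setdefault(tuple(sent), [])
            if fileid not in files:
                files.append(fileid)
    return mapping
```

With the reader from the code above, this would be called as dictionary = map_sentences_to_files(wordlists).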

Mapping two lists generated in NLTK to a dictionary
