Question

I'm running into some sort of logic error and can't seem to pinpoint it. This is what I have:

Document = 'Sample1'
locationslist = []
thedictionary = []
userword = ['the', 'a']
filename = 'Sample1'
for inneritem in userword:
     thedictionary.append((inneritem,locationslist))
     for position, item in enumerate(file_contents): 
        if item == inneritem:
            locationslist.append(position)
wordlist = (thedictionary, Document)
print wordlist

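So basically I'm trying to build a larger list (thedictionary) from a smaller list (locationslist) and the specific user words. I almost have it, except that the output puts the positions of all the words (there are only 2 of them, 'the' and 'a') in every list. It looks like a simple logic problem, but I can't spot it. The output is: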

([('the', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161, 2, 49, 57, 131, 167, 189, 194, 207, 215, 224]), 
  ('a', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161, 2, 49, 57, 131, 167, 189, 194, 207, 215, 224])], 
 'Sample1')

But it should be:

([('the', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161]), 
  ('a', [2, 49, 57, 131, 167, 189, 194, 207, 215, 224])], 
 'Sample1')

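See how both position lists end up attached to each of the user words 'the' and 'a' in the problematic output? I could use some advice on what I'm doing wrong.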

Answer 1

You only create one locationslist, so you only have one. It is shared by both words. You need to create a new locationslist on each loop iteration:

for inneritem in userword:
    locationslist = []
    thedictionary.append((inneritem,locationslist))
    # etc.
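The sharing can be seen directly in a small sketch. The sample file_contents below is an assumption (the question never shows it), and the names shared, buggy, fixed and locations are illustrative only.

```python
# Hypothetical sample data; the question does not show file_contents.
file_contents = "a cat saw the dog and the bird near a tree".split()
userword = ['the', 'a']

# Buggy pattern: one list object is created once and reused for every word.
shared = []
buggy = []
for word in userword:
    buggy.append((word, shared))
    for position, item in enumerate(file_contents):
        if item == word:
            shared.append(position)

# Fixed pattern: a fresh list is created on each iteration of the outer loop.
fixed = []
for word in userword:
    locations = []
    fixed.append((word, locations))
    for position, item in enumerate(file_contents):
        if item == word:
            locations.append(position)

# Both tuples in `buggy` hold the very same list object.
print(buggy[0][1] is buggy[1][1])
print(buggy)
print(fixed)
```

In the buggy version every tuple refers to the same list, so each append shows up under both words. In the fixed version each word gets its own list.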

Answer 2

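You only ever create one locationslist, so every locationslist.append() call modifies that same list. The same locationslist object is placed in every tuple you append to thedictionary, one tuple per element of userword. You should create a separate location list for each element of userword.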

The algorithm you have can be written as a set of nested list comprehensions, which results in the correct lists being created:

user_word = ['the', 'a']
word_list = ([(uw, 
               [position for position, item in enumerate(file_contents) 
                if item == uw]) 
               for uw in user_word], 
             'Sample1')

This still calls enumerate(file_contents) once for each item in user_word, which could be expensive if file_contents is large.
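As a rough illustration of that cost, here is the same comprehension wrapped in a function and run on made-up data. The function name positions_by_comprehension and the sample word list are assumptions for this sketch, not part of the original code.

```python
def positions_by_comprehension(words, targets):
    """Return [(target, [positions...]), ...], scanning `words` once per target."""
    return [(t, [i for i, w in enumerate(words) if w == t]) for t in targets]

# Hypothetical sample data standing in for file_contents.
sample = "a cat saw the dog and the bird near a tree".split()
targets = ['the', 'a']

result = positions_by_comprehension(sample, targets)
print(result)

# The inner comprehension runs once per target, so the number of items
# examined grows as len(targets) * len(words).
items_examined = len(targets) * len(sample)
print(items_examined)
```

With two target words the extra pass hardly matters. With many target words and a large document, the repeated scans add up.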

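I suggest you rewrite this to make a single pass over file_contents, checking the item at each position against the contents of user_word, and appending the position only to the list for the particular user_word found at that position. I suggest using a dictionary to keep the user_word lists separate and accessible: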

document = 'Sample1'

temp_dict = dict((uw, []) for uw in user_word)

for position, item in enumerate(file_contents):
    if item in temp_dict:
        temp_dict[item].append(position)

wordlist = ([(uw, temp_dict[uw]) for uw in user_word], document)

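Either solution gets you the positions of each user_word in the scanned document, in order of appearance. It also returns the list structure you are looking for.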
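For reference, the single-pass idea can be packaged as a self-contained function. This is only a sketch: the function name single_pass_positions, its parameters, and the sample data are assumptions made for illustration.

```python
def single_pass_positions(words, targets, document):
    """Scan `words` once and return ([(target, positions), ...], document)."""
    found = {t: [] for t in targets}  # a separate list per target word
    for position, item in enumerate(words):
        if item in found:  # dictionary membership test, no inner loop
            found[item].append(position)
    # Rebuild the list of tuples in the order the targets were given.
    return ([(t, found[t]) for t in targets], document)

# Hypothetical sample data standing in for file_contents.
sample = "a cat saw the dog and the bird near a tree".split()
wordlist = single_pass_positions(sample, ['the', 'a'], 'Sample1')
print(wordlist)
```

Because each target has its own list in the dictionary, the positions never mix, and the document is only walked once regardless of how many target words there are.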

Python: list append problem
