Question

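I have a text file, and I want to put every word from it into a dictionary and then print the index positions at which each word occurs in the file. The code I have only gives me the number of times each word appears. How can I change it? I have already converted the text to lowercase.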

dicti = {}  

for eachword in wordsintxt:
    freq = dicti.get(eachword, None)
    if freq == None:
        dicti[eachword] = 1
    else:
        dicti[eachword] = freq + 1

print(dicti)

Answer 1

Change the code to keep the indexes themselves rather than just a count:

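dicti = {}

for index, eachword in enumerate(wordsintxt):
    # Create the list the first time a word is seen...
    if eachword not in dicti:
        dicti[eachword] = []
    # ...and always record the position, including the first occurrence.
    dicti[eachword].append(index)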

If you still need a word's frequency, it is easy to recover:

freq = len(dicti[word])
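
As a self-contained sketch of the approach above, the index lists and the frequencies can be built and checked together. The sample word list and the name freqs are made up for illustration; in the question, wordsintxt comes from the text file.

```python
# Hypothetical sample data standing in for the words read from the file.
wordsintxt = ["the", "cat", "saw", "the", "dog"]

dicti = {}
for index, eachword in enumerate(wordsintxt):
    if eachword not in dicti:
        dicti[eachword] = []
    dicti[eachword].append(index)

# The frequency of each word is the length of its index list.
freqs = {word: len(indexes) for word, indexes in dicti.items()}

print(dicti)  # {'the': [0, 3], 'cat': [1], 'saw': [2], 'dog': [4]}
print(freqs)  # {'the': 2, 'cat': 1, 'saw': 1, 'dog': 1}
```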

Update per the OP's comment

Without enumerate, just provide that functionality yourself:

for index in range(len(wordsintxt)):
    eachword = wordsintxt[index]

I'm not sure why you would want to do this; the operation is idiomatic and common enough that the Python developers created enumerate for exactly this purpose.
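
A quick sketch to illustrate that the two loop styles yield the same (index, word) pairs. The sample list and the variable names are assumptions for the demonstration only.

```python
# Hypothetical sample data.
wordsintxt = ["a", "b", "a"]

# Pairs produced by enumerate.
with_enumerate = [(index, word) for index, word in enumerate(wordsintxt)]

# The same pairs produced by indexing manually over range(len(...)).
with_range = [(index, wordsintxt[index]) for index in range(len(wordsintxt))]

print(with_enumerate == with_range)  # True
```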

Answer 2

You can use this:

wordsintxt = ["hello", "world", "the", "a", "Hello", "my", "name", "is", "the"]
words_data = {}

for i, word in enumerate(wordsintxt):
    word = word.lower()
    words_data[word] = words_data.get(word, {'freq': 0, 'indexes': []})
    words_data[word]['freq'] += 1
    words_data[word]['indexes'].append(i)


for k, v in words_data.items():
    print(k, '\t', v)

Which prints:

hello    {'freq': 2, 'indexes': [0, 4]}
world    {'freq': 1, 'indexes': [1]}
the      {'freq': 2, 'indexes': [2, 8]}
a        {'freq': 1, 'indexes': [3]}
my       {'freq': 1, 'indexes': [5]}
name     {'freq': 1, 'indexes': [6]}
is       {'freq': 1, 'indexes': [7]}

You can avoid checking whether the key already exists in the dictionary and then running a custom action, simply by using data[key] = data.get(key, STARTING_VALUE).
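
A minimal sketch of that pattern, next to dict.setdefault as a built-in alternative that inserts the starting value only when the key is missing. The sample list and the names counts and positions are illustrative, not taken from the answer.

```python
# Hypothetical sample data.
words = ["x", "y", "x"]

# Pattern from the answer: fetch the current value or a starting value,
# then store the updated value back under the same key.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Alternative: setdefault returns the stored list, creating it if needed,
# so the index can be appended in a single expression.
positions = {}
for i, word in enumerate(words):
    positions.setdefault(word, []).append(i)

print(counts)     # {'x': 2, 'y': 1}
print(positions)  # {'x': [0, 2], 'y': [1]}
```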

Regards!

Answer 3

Use collections.defaultdict together with enumerate, and simply append each index that enumerate yields:

from collections import defaultdict

with open('test.txt') as f:
    content = f.read()

words = content.split()
dd = defaultdict(list)

for i, v in enumerate(words):
    dd[v.lower()].append(i)

print(dd)
# defaultdict(<class 'list'>, {'i': [0, 6, 35, 54, 57], 'have': [1, 36, 58],... 'lowercase.': [62]})
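
Note that the output above contains the key 'lowercase.' with its trailing period, because split() only breaks on whitespace. If that is unwanted, one possible variation (a sketch, not part of the original answer) strips surrounding punctuation before indexing. The sample text below is a made-up stand-in for the file contents.

```python
import string
from collections import defaultdict

# Hypothetical sample text standing in for the contents of the file.
content = "I have a file. I have converted it to lowercase."

dd = defaultdict(list)
for i, raw in enumerate(content.split()):
    # Strip leading/trailing punctuation so "lowercase." is stored as "lowercase".
    word = raw.strip(string.punctuation).lower()
    if word:  # skip tokens that consisted of punctuation only
        dd[word].append(i)

print(dict(dd))
```

The indexes still refer to positions in the whitespace-split word list, so they stay comparable with the answer's output.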
