How to find the indices of repeated words in Python

Time: 2017-09-14 06:24:24

Tags: python indexing

I wrote this code in Python:

import re

text = input('please enter text: ')
word = re.findall('\w+', text)
len_word = len(word)
word_pos = []

for i in range(len_word):
    if text.index(word[i]) in word_pos:
        prev_index = text.index(word[i]) + 1
        last_index = 0
        # print('index1: ' , last_index)
        text = text[prev_index:]
        # print('new_text: ' , new_text)
        word_pos.append(text.index(word[i]) + prev_index + last_index)
        last_index += prev_index
    else:
        word_pos.append(text.index(word[i]))

print(word_pos)

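For the input a a the output is [0, 2], which is correct. But for the input a a a the result is [0, 2, 1], and I want to see [0, 2, 4]. I need the code to be dynamic, because I don't know in advance when a duplicated word will appear in the input. If there is a solution, I would also like it to return the indices when a word is repeated more than twice. Thanks.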

2 answers:

Answer 0 (score: 2)

You can do it like this:

import re

text = input('please enter text: ')
words = re.findall(r'\w+', text)  # raw string, so \w is not treated as a string escape
word_pos = []
pos = 0 # this will help us track the word's position in the original text

for i in range(len(words)):
    word = words[i]
    pos += text[pos:].index(word) # we use the position of the last word to find the position of the current word
    if word in words[i+1:] or word in words[:i]: # we have a duplicate so we can append this position
        word_pos.append(pos) 
        print('{} found at {} in text'.format(word,pos))
    pos += 1

For the input "a a a a a", I get this result:

please enter text: a a a a a
a found at 0 in text
a found at 2 in text
a found at 4 in text
a found at 6 in text
a found at 8 in text
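
One caveat with the loop above: str.index matches substrings, so a short word such as a can be found inside an earlier, longer word such as cat. A minimal sketch of one way around this, assuming it is acceptable to take the offsets directly from the regex matches with re.finditer. The function name duplicate_positions is hypothetical and is not part of the answer above.

```python
import re
from collections import Counter


def duplicate_positions(text):
    """Return the start offset of every word that occurs more than once.

    Hypothetical helper: offsets come from the regex match objects,
    so no substring search with str.index is needed.
    """
    matches = list(re.finditer(r'\w+', text))
    # Count how often each word occurs in the whole text.
    counts = Counter(m.group() for m in matches)
    # Keep only the positions of words that appear at least twice.
    return [m.start() for m in matches if counts[m.group()] > 1]


print(duplicate_positions('a a a'))        # [0, 2, 4]
print(duplicate_positions('cat a cat a'))  # [0, 4, 6, 10]
```

Because each match object already carries its own start offset, the running pos counter is no longer needed.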

Answer 1 (score: 0)

import re
text = input('please enter text: ')
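# re.escape keeps words containing regex metacharacters (e.g. "c++") from being read as patterns
print({word: [w.start() for w in re.finditer(re.escape(word), text)] for word in text.split()})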

input 1:

please enter text: a a a a

output:
{'a': [0, 2, 4, 6]}

input 2:
please enter text: Hello Arun Hello man

output:
{'Arun': [6], 'Hello': [0, 11], 'man': [17]}
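
Note that searching the text for each word separately also reports matches inside longer words: for the input a man, the word a would be found at 0 and again at 3, inside man. A possible single-pass variant is sketched below, assuming that a "word" means a run of \w characters as in the question. The function name word_positions is hypothetical.

```python
import re
from collections import defaultdict


def word_positions(text):
    """Map each word to the list of offsets where it starts.

    Hypothetical helper: the text is scanned once, and each whole-word
    match is recorded under its own word, so 'a' is never matched inside 'man'.
    """
    positions = defaultdict(list)
    for m in re.finditer(r'\w+', text):
        positions[m.group()].append(m.start())
    return dict(positions)


print(word_positions('Hello Arun Hello man'))
# {'Hello': [0, 11], 'Arun': [6], 'man': [17]}
print(word_positions('a man'))
# {'a': [0], 'man': [2]}
```

To keep only the repeated words, filter the result for lists with more than one entry.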