Question

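This is a program that takes a list of words (text) and appends numbers to a list of numbers (called numbers) to represent the indices of the original text. For example, the phrase "the sailor went to sea sea sea to see what he could see see see but all that he could see see see was the bottom of the deep blue sea sea sea" should be returned as "1 2 3 4 5 5 5 4 6 7 8 9 6 6 6 10 11 12 8 9 6 6 6 13 1 14 15 1 16 17 5 5 5", but is instead returned as "1 2 3 4 5 5 5 4 9 10 11 12 9 9 9 13 14 15 11 12 9 9 9 16 1 17 18 1 19 20 5 5 5", which skips numbers.

Here is the part of the program that causes the problem: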

for position, item in enumerate(text):
    if text.count(item) < 2:
        numbers.append(max(numbers) + 1)
    else:
        numbers.append(text.index(item) + 1)

Both "numbers" and "text" are lists.

Answer 1

A solution using a dictionary:

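text = "the sailor went to sea sea sea to see what he could see see see but all that he could see see see was the bottom of the deep blue sea sea sea"
l = text.split(' ')
d = dict()  # maps each word to the number assigned at its first appearance
cnt = 0     # how many distinct words have been seen so far
for word in l:
    if word not in d:
        cnt += 1
        d[word] = cnt
# look up the assigned number for every word, in the original order
out = [d[w] for w in l]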

#[1, 2, 3, 4, 5, 5, 5, 4, 6, 7, 8, 9, 6, 6, 6, 10, 11, 12, 8, 9, 6, 6, 6, 13, 1, 14, 15, 1, 16, 17, 5, 5, 5]
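
The same idea can be packaged as a reusable function. This is only a minimal sketch: the name first_seen_ranks is hypothetical and does not come from the answer above, and it uses dict.setdefault in place of the explicit counter.

```python
def first_seen_ranks(words):
    """Return, for each word, the 1-based rank of its first appearance."""
    ranks = {}  # word -> rank assigned when the word was first seen
    out = []
    for word in words:
        # len(ranks) + 1 is evaluated before the lookup, so a new word gets
        # the next unused rank and a known word keeps its existing one.
        out.append(ranks.setdefault(word, len(ranks) + 1))
    return out


print(first_seen_ranks("to sea sea sea to see".split()))
```

Because each word costs one dictionary lookup, the numbers never skip: a new rank is handed out only when a word is genuinely new.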

Answer 2

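A simple solution is to create a version of the text without duplicates that keeps the same order, and then use index() on that list to look up each word of the original text:

Create a list from the string by splitting on spaces: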

text="the sailor went to sea sea sea to see what he could see see see but all that he could see see see was the bottom of the deep blue sea sea sea"
listText=text.split(" ")

Create a new list without duplicates that contains every word in the text, using count() to keep only words that have not appeared earlier:

unique_text=[listText[x] for x in range(len(listText)) if listText[:x].count(listText[x])<1]

Use a list comprehension to get the index in unique_text of each word in listText (and add 1):

positions=[unique_text.index(x)+1 for x in listText]

Final code:

text="the sailor went to sea sea sea to see what he could see see see but all that he could see see see was the bottom of the deep blue sea sea sea"
listText=text.split(" ")
unique_text=[listText[x] for x in range(len(listText)) if listText[:x].count(listText[x])<1]
positions=[unique_text.index(x)+1 for x in listText]

Output:

[1, 2, 3, 4, 5, 5, 5, 4, 6, 7, 8, 9, 6, 6, 6, 10, 11, 12, 8, 9, 6, 6, 6, 13, 1, 14, 15, 1, 16, 17, 5, 5, 5]
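
Note that count() and index() each rescan a list for every word, so the approach above does work roughly proportional to the square of the text length. As a hedged alternative, the order-preserving deduplication can be sketched with dict.fromkeys, assuming Python 3.7 or later, where dictionaries keep insertion order. The function name ranks_via_unique is hypothetical.

```python
def ranks_via_unique(words):
    # dict.fromkeys keeps one key per distinct word, in first-seen order
    # (guaranteed for Python 3.7+; this sketch assumes such a version).
    unique = dict.fromkeys(words)
    # Number the distinct words 1, 2, 3, ... in that order.
    rank = {word: i for i, word in enumerate(unique, start=1)}
    # Translate every word of the original text to its number.
    return [rank[word] for word in words]


text = "the sailor went to sea sea sea to see what he could see"
print(ranks_via_unique(text.split(" ")))
```

The result is the same numbering as the list-based version, with one dictionary lookup per word in place of a list scan.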

How to make the numbers added to the numbers list correspond to the indices of the word list without skipping any numbers
