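I start with a list of words such as ["ONE","TWO","THREE","FOUR"].
Later, I join the list to create a single string: "ONETWOTHREEFOUR". While working with this string I end up with a list of indices, say [6,7,8,0,4] (these map onto the string to spell "THROW", although, as pointed out in the comments, that is irrelevant to my question).
Now I want to know which items of the original list supplied the letters I used. I know I used the letters at positions [6,7,8,0,4] of the joined string. Given that list of string indices, I want the output {0,1,2}, because I used letters from every word in the original list except "FOUR".
What I have tried so far: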
wordlist = ["ONE","TWO","THREE","FOUR"]
stringpositions = [6,7,8,0,4]
wordlengths = tuple(len(w) for w in wordlist) #->(3, 3, 5, 4)
wordstarts = tuple(sum(wordlengths[:i]) for i in range(len(wordlengths))) #->(0, 3, 6, 11)
words_used = set()
for pos in stringpositions:
    prev = 0
    for wordnumber,wordstart in enumerate(wordstarts):
        if pos < wordstart:
            words_used.add(prev)
            break
        prev = wordnumber
    else:
        # no later word start exceeds pos, so pos falls in the last word
        words_used.add(prev)
This seems very verbose. What is the best (and/or most Pythonic) way to do this?
Answer 0: (score: 1)
This is the simplest approach. If you want something more space-efficient, you may need some kind of binary search tree.
wordlist = ["ONE","TWO","THREE","FOUR"]
top = 0
inds = {}
for i,word in enumerate(wordlist):
    for k in range(top, top+len(word)):
        inds[k] = i
    top += len(word)
#do some magic
L = [6,7,8,0,4]
for i in L: print(inds[i])
Output:
2
2
2
0
1
You can of course call set() on the output if you want.
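As a rough sketch of the more space-efficient idea mentioned above: instead of a tree, a binary search over the word start offsets with the standard `bisect` module stores only one number per word rather than one entry per character. The helper name `word_index` is made up for this illustration, and the sketch assumes every position is a valid index into the joined string.

```python
from bisect import bisect_right
from itertools import accumulate

wordlist = ["ONE", "TWO", "THREE", "FOUR"]
stringpositions = [6, 7, 8, 0, 4]

# Start offset of each word in the joined string: [0, 3, 6, 11]
wordstarts = [0] + list(accumulate(len(w) for w in wordlist))[:-1]

def word_index(pos):
    """Return the index of the word covering string position pos (hypothetical helper)."""
    # bisect_right counts how many word starts are <= pos;
    # subtracting 1 turns that count into the index of the covering word.
    return bisect_right(wordstarts, pos) - 1

words_used = {word_index(pos) for pos in stringpositions}
print(words_used)  # {0, 1, 2}
```

Each lookup is O(log n) in the number of words, at the cost of slightly less obvious code than the dictionary version.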
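Answer 1: (score: 1)
As clarified in the comments, the OP's goal is to determine which words were used based on the string positions used, not which letters were used, so the word/substring THROW is basically irrelevant.
Here is a very short version: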
from itertools import chain
wordlist = ["ONE","TWO","THREE","FOUR"]
string = ''.join(wordlist) # "ONETWOTHREEFOUR"
stringpositions = [6,7,8,0,4]
# construct a list that maps every position in string to a single source word
# (chain.from_iterable flattens the per-word lists; plain chain() would not)
which_word = list(chain.from_iterable( [ii]*len(w) for ii, w in enumerate(wordlist) ))
# it's now trivial to use which_word to construct the set of words
# represented in the list stringpositions
words_used = set( which_word[pos] for pos in stringpositions )
print("which_word=", which_word)
print("words_used=", words_used)
==>
which_word= [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3]
words_used= {0, 1, 2}
Edit: updated to use list(itertools.chain.from_iterable(generator)) instead of sum(generator, []), as suggested by @inspectorG4dget in the comments.
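For reference, a small sketch comparing the two ways of flattening the per-word lists that the edit note refers to. The name `pieces` is only illustrative. Both produce the same list; the difference is that `sum(..., [])` builds a new list at every addition, so its cost grows quadratically with the total length, while `chain.from_iterable` walks each element once.

```python
from itertools import chain

wordlist = ["ONE", "TWO", "THREE", "FOUR"]

# One list per word, repeating the word's index once per character
pieces = [[i] * len(w) for i, w in enumerate(wordlist)]

flat_sum = sum(pieces, [])                      # repeated list concatenation
flat_chain = list(chain.from_iterable(pieces))  # single linear pass

assert flat_sum == flat_chain
print(flat_chain)
```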