I have a string teststring and a list of substrings s, but teststring was accidentally split apart. Now I want to know the indices in the list which, when put together, recreate teststring.
teststring = "Hi this is a test!"
s = ["Hi", "this is", "Hello,", "Hi", "this is", "a test!", "How are", "you?"]
The expected output is (the strings in list s that make up teststring must appear consecutively, so [0,4,5] would be wrong):
[3,4,5]
Does anyone know how to do this?
I tried to come up with a decent solution, but found nothing that works...
I just record, for each substring in s, whether it appears in teststring:
test_list = []
for si in s:
    if si in teststring:
        flag = True
    else:
        flag = False
    test_list.append(flag)
Then you get: [True, True, False, True, True, True, False, False]
...and then you have to get the indices of the longest run of consecutive "True" values. Does anyone know how to get these indices?
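For the narrower step of locating the longest run of consecutive True values, a minimal sketch using itertools.groupby could look like the following. The helper name longest_true_run is made up for illustration. Note that the longest run is not guaranteed to spell out teststring; this only covers the run-finding step.

```python
from itertools import groupby

def longest_true_run(flags):
    """Return the indices of the longest consecutive run of True values."""
    best = []
    # groupby splits the (index, flag) pairs into runs of equal flag values.
    for value, group in groupby(enumerate(flags), key=lambda pair: pair[1]):
        if value:
            run = [index for index, _ in group]
            if len(run) > len(best):  # strict '>' keeps the first run on ties
                best = run
    return best

test_list = [True, True, False, True, True, True, False, False]
print(longest_true_run(test_list))  # [3, 4, 5]
```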
Answer 0 (score: 1)
This is a bit convoluted, but it gets the job done:
start_index = ' '.join(s).index(teststring)
s_len = 0
t_len = 0
indices = []
found = False
for i, sub in enumerate(s):
    s_len += len(sub) + 1  # To account for the space
    if s_len > start_index:
        found = True
    if found:
        t_len += len(sub) + 1  # count the joining space here as well
        # + 1 because the last matched item has no trailing space in teststring
        if t_len > len(teststring) + 1:
            break
        indices.append(i)
Answer 1 (score: 1)
If what you want is a list of consecutive indices that form the string when joined, then I think this will do what you need:
teststring = "Hi this is a test!"
s = ["Hi", "this is", "Hello,", "Hi", "this is", "a test!", "How are", "you?"]
test_list = []
i = 0  # the index of the current element si
for si in s:
    if si in teststring:
        # add the index to the list
        test_list.append(i)
        # check to see if the concatenation of the elements at these
        # indices form the string. if so, this is the list we want, so exit the loop
        if ' '.join(str(s[t]) for t in test_list) == teststring:
            break
    else:
        # if we've hit a substring not in our teststring, clear the list because
        # we only want consecutive indices
        test_list = []
    i += 1
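Answer 2 (score: 1)
Join the list into one big string, find the target string inside it, then work out the start and end indices by checking the length of each string in the list.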
>>> teststring = "Hi this is a test!"
>>> s = ["Hi", "this is", "Hello,", "Hi", "this is", "a test!", "How are", "you?"]
>>> joined = ' '.join(s)
>>> index = joined.index(teststring)
>>> lengths = list(map(len, s))
>>> loc = 0
>>> for start,ln in enumerate(lengths):
...     if loc == index:
...         break
...     loc += ln + 1
...
>>> dist = 0
>>> end = start
>>> while dist < len(teststring):  # consume items until the target length is covered
...     dist += lengths[end] + 1
...     end += 1
...
>>> list(range(start, end))
[3, 4, 5]
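The same join-and-search idea can be wrapped in a function that also checks that the match starts and ends on list-item boundaries. This is only a sketch and not part of the answer above: find_indices and its sep parameter are hypothetical names, and it assumes the separator is non-empty.

```python
def find_indices(teststring, s, sep=" "):
    """Return the consecutive indices of s whose items join into teststring, or None."""
    joined = sep.join(s)
    start_at = {}  # character offset where an item begins -> item index
    end_at = {}    # character offset just past an item -> item index
    offset = 0
    for i, item in enumerate(s):
        start_at[offset] = i
        end_at[offset + len(item)] = i
        offset += len(item) + len(sep)

    pos = joined.find(teststring)
    while pos != -1:
        end = pos + len(teststring)
        # Accept the match only if it begins and ends on item boundaries.
        if pos in start_at and end in end_at:
            return list(range(start_at[pos], end_at[end] + 1))
        pos = joined.find(teststring, pos + 1)  # otherwise look for a later match
    return None

s = ["Hi", "this is", "Hello,", "Hi", "this is", "a test!", "How are", "you?"]
print(find_indices("Hi this is a test!", s))  # [3, 4, 5]
```

Returning None instead of raising keeps the "no match" case explicit, and the boundary check rejects matches that begin or end in the middle of a list item.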
Answer 3 (score: 1)
Here is my take on the problem, hope it helps:
def rebuild_string(teststring, s):
    for i in range(len(s)):  # loop through our whole list
        if s[i] in teststring:
            index_list = [i]  # reset each time
            temp_string = teststring.replace(s[i], "").strip()
            while True:  # keep extending the run from this starting point
                if len(temp_string) == 0:  # we've eliminated all characters
                    return index_list  # all matches are found, so we exit every loop
                i += 1  # advance manually; the for loop resets i on its next pass
                if i < len(s) and s[i] in temp_string:  # the next item is also in our string
                    index_list.append(i)
                    temp_string = temp_string.replace(s[i], "").strip()
                else:
                    break  # go back to for loop and try again
    return None  # no match exists in the list
my_test = "Hi this is a test!"
list_of_strings = ["Hi", "this is", "Hello,", "Hi", "this is", "a test!", "How are", "you?"]
print(rebuild_string(my_test, list_of_strings))
Result:
[3, 4, 5]
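Basically, I find where a list item occurs in the main string, and then the next consecutive list item must also be present in the string, and so on until nothing is left to match (stripping whitespace along the way). This will also match strings that are out of order in the list, as long as together they recreate the whole string. Not sure if that is what you were after...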
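If the order of the pieces matters, a stricter alternative is to try each starting index and extend a window only while the joined items remain a prefix of the target. The following is a sketch under that assumption; consecutive_indices and sep are hypothetical names, not something from the answers above.

```python
def consecutive_indices(teststring, s, sep=" "):
    """Return the first run of consecutive indices of s that joins into teststring."""
    for start in range(len(s)):
        candidate = ""
        for end in range(start, len(s)):
            # Grow the window by one item, inserting the separator between items.
            candidate = s[end] if end == start else candidate + sep + s[end]
            if candidate == teststring:
                return list(range(start, end + 1))
            if not teststring.startswith(candidate):
                break  # this window can no longer match; try the next start
    return None

s = ["Hi", "this is", "Hello,", "Hi", "this is", "a test!", "How are", "you?"]
print(consecutive_indices("Hi this is a test!", s))  # [3, 4, 5]
print(consecutive_indices("Hi this is a test!", ["a test!", "this is", "Hi"]))  # None
```

Because each window is compared against the target directly, out-of-order pieces are rejected, at the cost of rebuilding candidate strings for every starting index.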