How to split a string into a list at irregular intervals

Time: 2019-04-11 07:17:59

Tags: python

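I am trying to split a string using a list of intervals. For each interval, I want to insert a space before the character at the index given by the first value and another space after the character at the index given by the second value.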

I know how to split a string at fixed intervals:

string = 'anexample'
result = []
for i in range(0, len(string), 2):
    result.append(' ')
    result.append(string[i:i+2])
# result == [' ', 'an', ' ', 'ex', ' ', 'am', ' ', 'pl', ' ', 'e']

But I am not sure how to do this with a list of intervals like the following:

string = 'anexample'
result = []
interval_list = [[0,0],[2,5]]

and get this result:

result = [' ','a',' ','n',' ','exam',' ','ple']

Thanks for any help.

Edit: interval_list is derived by matching the words in a list against the string, for example:

string = 'anexample'
word_list = ['exam']
interval_list = [[2,5]]

where string[2] = 'e' and string[5] = 'm'. Adding a space before the 'e' and after the 'm' gives:

result = ['an',' ','exam',' ','ple']
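
For reference, here is a minimal sketch of how such an interval_list could be derived from word_list. The helper name find_intervals is hypothetical (it does not appear in the question), and the sketch assumes that only the first occurrence of each word matters:

```python
def find_intervals(string, word_list):
    """Return [start, end] index pairs (end inclusive) for each word found in string.

    Hypothetical helper: only the first occurrence of each word is recorded.
    """
    intervals = []
    for word in word_list:
        start = string.find(word)  # -1 when the word does not occur
        if word and start != -1:   # skip empty words and words that are absent
            intervals.append([start, start + len(word) - 1])
    return sorted(intervals)


print(find_intervals('anexample', ['exam']))  # [[2, 5]]
```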

4 answers:

Answer 0 (score: 0)

Assuming the intervals are always listed in order:

string = 'anexample'
result = []
interval_list = [[0,0],[2,5]]

for i,interval in enumerate(interval_list):
    # append the part of the string before the first interval (if any)
    if i < 1 and interval[0] > 0:
        result.append(string[0:interval[0]])

    result.append(' ')
    result.append(string[interval[0]:interval[1]+1])
    result.append(' ')

    # append the part of the string before the next interval (if any)
    if i < len(interval_list) - 1 and (interval_list[i+1][0]>interval[1]+1):
        result.append(string[interval[1]+1:interval_list[i+1][0]])

    # append the rest of the string (if any) after the last interval
    elif i == len(interval_list) - 1 and interval[1] < len(string) - 1:
        result.append(string[interval[1]+1:])

print(result)

Output:

[' ', 'a', ' ', 'n', ' ', 'exam', ' ', 'ple']
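
Since the loop above relies on the intervals being in order, it may help to verify that before splitting. The following is only a sketch under that assumption; check_intervals is a hypothetical helper name, and it also rejects overlapping or out-of-range intervals, which the answer does not discuss:

```python
def check_intervals(interval_list, length):
    """Return the intervals sorted by start index, or raise ValueError.

    Hypothetical helper: each interval is [start, end] with end inclusive,
    and length is the length of the string being split.
    """
    ordered = sorted(interval_list)
    previous_end = -1
    for start, end in ordered:
        if not (0 <= start <= end < length):
            raise ValueError(f"interval out of range: {[start, end]}")
        if start <= previous_end:
            raise ValueError(f"interval overlaps its predecessor: {[start, end]}")
        previous_end = end
    return ordered


print(check_intervals([[2, 5], [0, 0]], len('anexample')))  # [[0, 0], [2, 5]]
```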

Answer 1 (score: 0)

With all due respect, your interval_list is awkward to work with. It would be easier if it looked like this:

lst = [0, 1, 2, 6, 9]

Then you simply do

for a, b in zip(lst[:-1], lst[1:]):
    result.extend([' ', string[a:b]])

# print(result)
# [' ', 'a', ' ', 'n', ' ', 'exam', ' ', 'ple']

and you are done.

If you have no control over the structure of interval_list, you can compute this more suitable list as follows:

lst = [i for sub in interval_list for i in sub]
for i in range(1, len(lst), 2):
    lst[i] += 1
lst += [len(string)]

# [0, 1, 2, 6, 9]

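Or, if you happen to have numpy imported anyway:

import numpy as np

lst = np.array(interval_list).flatten()
lst[1::2] += 1  # turn each inclusive end index into an exclusive one
lst = np.append(lst, len(string))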
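
Putting the pieces of this answer together, a self-contained sketch could look like the following. The function names cut_points and split_at are hypothetical, and the sketch additionally assumes that 0 and len(string) should always be cut points and that duplicate cut points should be dropped, which goes slightly beyond what the answer states:

```python
def cut_points(interval_list, length):
    """Turn [start, end] intervals (end inclusive) into sorted cut positions."""
    points = {0, length}  # assumption: always cut at both ends of the string
    for start, end in interval_list:
        points.add(start)
        points.add(end + 1)  # cut after the last character of the interval
    return sorted(p for p in points if 0 <= p <= length)


def split_at(string, points):
    """Slice string between consecutive cut points, putting ' ' before each piece."""
    result = []
    for a, b in zip(points[:-1], points[1:]):
        result.extend([' ', string[a:b]])
    return result


string = 'anexample'
points = cut_points([[0, 0], [2, 5]], len(string))
print(points)                    # [0, 1, 2, 6, 9]
print(split_at(string, points))  # [' ', 'a', ' ', 'n', ' ', 'exam', ' ', 'ple']
```

Note that with interval_list = [[2, 5]] this sketch yields a leading ' ' before 'an', which differs from the second expected result in the question.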

Answer 2 (score: 0)

If the interval list starts at 0, you can use the following code:

string = 'anexample'
result = []
interval_list = [[0,0],[2,5]]

for i in range(len(interval_list)) :
    current_interval = interval_list[i]
    result.append(" ")
    result.append(string[current_interval[0]:current_interval[1]+1])
    result.append(" ")
    if i < len(interval_list) - 1 :
        next_interval = interval_list[i+1]
        result.append(string[current_interval[1]+1:next_interval[0]])
    if i == len(interval_list) - 1 :
        if string[current_interval[1]+1:] != '' :  # compare by value; `is not` tests identity
            result.append(string[current_interval[1]+1:])

Output: [' ', 'a', ' ', 'n', ' ', 'exam', ' ', 'ple']

Answer 3 (score: 0)

string = 'anexample'
result = []
interval_list = [[0,0],[2,5]]


# Step 1:turn string into element list
string_list = list(string) #['a', 'n', 'e', 'x', 'a', 'm', 'p', 'l', 'e']

# Step 2: we want to insert " " according to interval_list, but each insertion shifts every later index by one.
# So we convert the indices into new_interval_list, which accounts for the spaces inserted before them
new_interval_list = [[None, None] for _ in interval_list] # same shape as interval_list

for i in range(len(interval_list)):
    #print(str(i))
    for j in range(2):
        #print(str(j))
        if j == 0: # first interval in a list
            new_interval_list[i][j] = interval_list[i][j] + i * 2
            #print(new_interval_list)
        else: # second interval in a list
            new_interval_list[i][j] = interval_list[i][j] + 2 + i * 2
            #print(new_interval_list)

# the new_interval_list returns [[0, 2], [4, 9]]

# Step 3: we turn [[0, 2], [4, 9]] into [0,2,4,9]
import itertools
new_interval_list = list(itertools.chain.from_iterable(new_interval_list))

# Step 4: now we can insert " " into the list
for item in new_interval_list:
    string_list.insert(item, " ")
# string_list is now [' ', 'a', ' ', 'n', ' ', 'e', 'x', 'a', 'm', ' ', 'p', 'l', 'e']

# Step 5: to get ['a', 'n', 'exam', 'ple']

tem_l = ("".join(string_list)).split() # ['a', 'n', 'exam', 'ple']

# Step 6: build result by putting " " before each item in tem_l
for item in tem_l:
    result.append(" ")
    result.append(item)


print(result)  # [' ', 'a', ' ', 'n', ' ', 'exam', ' ', 'ple']
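
The index bookkeeping in Step 2 exists only because each insertion shifts the positions after it. One alternative, sketched here under the assumption that the intervals do not overlap, is to insert from right to left so that earlier indices are never disturbed. This is a different technique from the one in the answer, and the name insert_spaces is hypothetical:

```python
def insert_spaces(string, interval_list):
    """Insert ' ' before each interval start and after each interval end.

    Intervals are [start, end] with end inclusive and are assumed not to
    overlap. Processing them from the rightmost to the leftmost means no
    offset correction is needed.
    """
    chars = list(string)
    for start, end in sorted(interval_list, reverse=True):
        chars.insert(end + 1, ' ')  # insert after the interval first ...
        chars.insert(start, ' ')    # ... so the start index is still valid
    return ''.join(chars)


spaced = insert_spaces('anexample', [[0, 0], [2, 5]])
print(repr(spaced))    # ' a n exam ple'
print(spaced.split())  # ['a', 'n', 'exam', 'ple']
```

As with the answer above, the final split() only recovers the pieces correctly when the original string contains no whitespace of its own.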