How can I get a list of every possible list of substrings from a string in Python?

Asked: 2018-02-11 17:33:10

Tags: python string list if-statement substring

I am trying to convert a given string ("hello") into a list containing every possible list of substrings. For example:

[["hello"],["h","ello"],["he","llo"],["hel","lo"],["hell","o"],\
["h","e","llo"],["h","e","l","lo"],["h","e","l","l","o"],["he","l","lo"],\
["hel","l","o"],["hell","o"]....etc....].

I understand that the fastest approach should be a recursive function, but I can't get it right. Something like:

x = "hello"
wordset=[]
string_div(0,x,wordset)
...
...
def string_div(i, word, wordset):
  wordset.append(wordset + [word[i:]])
  ......
  
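(This is not a duplicate of other previously posted questions, because I only want the lists of substrings that form the original word when concatenated.) Any help would be appreciated! Thanks.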

1 Answer:

Answer 0: (score: 0)

I don't think this is strictly a duplicate, so I'll give an exact solution to your question.

Solution

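For a given string of length n, each binary string of length (n-1) gives a unique, valid partition of the string: a "1" at index i of the binary string means a cut between character i and character i+1 of the string (counting from 0), and a "0" means no cut there. For example, for the string "coconut", the binary string "001010" corresponds to the partition ['coc', 'on', 'ut'], and the binary string "100101" corresponds to ['c', 'oco', 'nu', 't'].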
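To make the correspondence concrete, here is a minimal sketch that applies a single binary string to a word. The helper name `split_by_bits` is hypothetical and used only for illustration, and the sketch assumes the bit string has exactly one character per gap between letters.

```python
def split_by_bits(string, bits):
    """Split `string` wherever `bits` has a '1'.

    Hypothetical helper for illustration only.
    Assumes len(bits) == len(string) - 1 (one bit per gap between letters).
    """
    if len(bits) != len(string) - 1:
        raise ValueError("bits must have exactly len(string) - 1 characters")
    pieces = []
    start = 0
    for i, bit in enumerate(bits):
        if bit == "1":
            # Cut between string[i] and string[i + 1].
            pieces.append(string[start:i + 1])
            start = i + 1
    pieces.append(string[start:])  # whatever remains after the last cut
    return pieces


print(split_by_bits("coconut", "001010"))  # ['coc', 'on', 'ut']
print(split_by_bits("coconut", "100101"))  # ['c', 'oco', 'nu', 't']
```

Slicing from a remembered start index avoids building each substring one character at a time, but the result is the same as accumulating letters until a "1" is seen.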

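Therefore, we can get the complete list of partitions by building the partition that corresponds to each distinct binary sequence. There are 2^(n-1) such sequences, including the all-zeros sequence, which leaves the string whole.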
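One way to sanity-check the count is to enumerate sets of cut positions directly instead of binary strings. Each of the n-1 gaps is either cut or not, so this should give 2^(n-1) lists when the uncut string itself is counted. The sketch below is an illustration under that assumption, and the name `partitions_by_cuts` is hypothetical.

```python
from itertools import combinations


def partitions_by_cuts(string):
    """Yield every partition by choosing a set of cut positions.

    Hypothetical helper for illustration only.
    """
    n = len(string)
    gaps = range(1, n)  # the n - 1 positions between adjacent characters
    for k in range(n):  # k = number of cuts, from 0 up to n - 1
        for cuts in combinations(gaps, k):
            bounds = (0, *cuts, n)
            # Consecutive bounds delimit one piece each.
            yield [string[a:b] for a, b in zip(bounds, bounds[1:])]


parts = list(partitions_by_cuts("hello"))
print(len(parts))  # 16, i.e. 2 ** (5 - 1)
print(parts[0])    # ['hello'] (no cuts)
```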

Implementation

import itertools
def get_list_partitions(string):
  partitions = []

  #list of binary sequences
  binary_sequences = ["".join(seq) for seq in itertools.product("01", repeat=len(string)-1)]

  #go over every binary sequence (which represents a partition)
  for sequence in binary_sequences:
    partition = []

    #current substring, accumulates letters until it encounters "1" in the binary sequence
    curr_substring = string[0]
    for i, bit in enumerate(sequence):
      #if 0, don't partition. otherwise, add curr_substring to the current partition and set curr_substring to be the next letter
      if bit == '0':                        
        curr_substring = curr_substring + string[i+1]
      else:
        partition.append(curr_substring)
        curr_substring = string[i+1]  

    #add the last substring to the partition
    partition.append(curr_substring)

    #add partition to the list of partitions
    partitions.append(partition)  

  return partitions 


print(get_list_partitions("coconut"))
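Since the question asked about recursion, the same result can also be sketched recursively: take every possible first piece, then partition the rest of the word. This is an alternative illustration and not the approach used in the implementation above; the name `iter_partitions` is hypothetical.

```python
def iter_partitions(word):
    """Recursively yield every way to split `word` into non-empty pieces.

    Hypothetical helper for illustration only.
    """
    if not word:
        # Base case: nothing left to split, so there is one empty partition.
        yield []
        return
    for i in range(1, len(word) + 1):
        head = word[:i]  # choose the first piece
        for rest in iter_partitions(word[i:]):  # partition what remains
            yield [head] + rest


result = list(iter_partitions("hello"))
print(len(result))  # 16
print(result[0])    # ['h', 'e', 'l', 'l', 'o']
```

Note that for an empty string this sketch yields a single empty list, whereas the implementation above assumes a non-empty input. The order of the results also differs, so compare the two as sorted lists if you want to check that they agree.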