Question

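I have two lists: a and b.

a is a list containing three or more strings, while b is a list of separators.

I need to generate all possible combinations of a and then "merge" the result with all possible combinations of b (see the example for a better understanding).

I ended up using this code: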

from itertools import permutations, combinations, product

a = ["filename", "timestamp", "custom"]
b = ["_", "-", ".", ""]

output = []

for com in combinations(b, len(a) - 1):
    for per in product(com, repeat=len(a) - 1):
        for ear_per in permutations(a):
            out = ''.join(map(''.join, zip(list(ear_per[:-1]), per))) + list(ear_per)[-1]
            output.append(out)

# For some reason the algorithm is generating duplicates
output = list(dict.fromkeys(output))

for o in output:
    print(o)

Here is a sample of the output (correct, and in this case exactly what I need):

timestamp.customfilename
filenamecustom.timestamp
custom_filenametimestamp
timestamp_custom_filename
timestamp-filename.custom
custom_filename-timestamp
filename.timestamp-custom
. . .
filename.custom.timestamp
filename-customtimestamp
custom-timestamp_filename
filename_custom-timestamp
filename.timestampcustom
timestampcustom-filename
custom-timestamp.filename
filenamecustom_timestamp
timestamp.custom_filename
custom.timestampfilename
timestampfilename.custom
customfilename_timestamp
filenametimestamp-custom
custom-filenametimestamp
timestampfilename-custom
timestamp-custom-filename
custom.filenametimestamp
customfilenametimestamp
timestampfilename_custom
custom_filename.timestamp
custom-timestamp-filename
custom-timestampfilename
filename_timestamp.custom
. . .
filename.custom-timestamp
timestamp_filenamecustom
custom_timestampfilename
timestamp.custom.filename
timestamp.filename-custom
filename-custom-timestamp
customfilename.timestamp
filename_timestamp_custom
timestamp_filename.custom
customtimestampfilename
filenamecustomtimestamp
custom.timestamp_filename
filename_customtimestamp
. . .
timestamp-customfilename
filename_custom.timestamp

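This algorithm has two main problems:

It generates some duplicate lines, so I always need to remove them (which is slow on bigger datasets).
If len(a) > len(b) + 2, the script won't start. In that case I need to repeat the separators to cover the len(a) - 1 available slots between the words contained in a.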
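
Both problems can be reproduced in isolation by looking only at the separator loops. The sketch below is illustrative: the names `pairs` and `too_many` are not part of the original script. It suggests that the duplicates come from running product over each combination (a pair such as ("_", "_") is produced once for every combination that contains "_"), and that combinations yields nothing at all when asked for more items than b holds.

```python
from itertools import combinations, product

b = ["_", "-", ".", ""]

# Separator pairs exactly as the two outer loops of the script produce them
# (len(a) - 1 == 2 slots for three words).
pairs = [per for com in combinations(b, 2) for per in product(com, repeat=2)]
print(len(pairs), len(set(pairs)))  # 24 16 -> 8 of the 24 pairs are repeats

# Asking for more separators than b contains yields no combinations,
# so the loops over it never execute.
too_many = list(combinations(b, len(b) + 1))
print(too_many)  # []
```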

Answer 1

This may be one solution. It takes the permutations of a (3*2 == 6) interleaved with the product of b (2 at a time here, 4*4 == 16), for a total of 6 * 16 == 96 results.
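
The approach described above can be sketched as follows. This is a reconstruction under assumptions, not necessarily the answerer's exact code: the function name `interleave_all` is hypothetical, and only `itertools.permutations` and `itertools.product` come from the description. Because product allows a separator to repeat, this also works when a has more words than b has separators.

```python
from itertools import permutations, product

def interleave_all(words, seps):
    """Yield every ordering of words joined by every choice of separators."""
    if not words:
        return
    slots = len(words) - 1  # one separator between each pair of neighbours
    for order in permutations(words):
        for chosen in product(seps, repeat=slots):
            # zip stops after `slots` pairs, so the last word is appended separately.
            yield "".join(w + s for w, s in zip(order, chosen)) + order[-1]

a = ["filename", "timestamp", "custom"]
b = ["_", "-", ".", ""]

results = list(interleave_all(a, b))
print(len(results))       # 96
print(len(set(results)))  # 96, so no de-duplication pass is needed for this input
```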

Answer 2

You may be looking for this:

a = ["filename", "timestamp", "custom"]
b = ["_", "-", ".", ""]
count = 0  # number of sequences printed so far

def print_sequence(sol_words, sol_seps):
  # Interleave the chosen words and separators, then append the last word.
  global count
  print("".join([sol_words[i] + sep for (i, sep) in enumerate(sol_seps)] + [sol_words[-1]]))
  count += 1

def backtrack_seps(sol_words, seps, sol_seps, i):
  # Fill separator slot i with every separator; separators may repeat.
  for sep in seps:
    sol_seps[i] = sep

    if i == len(sol_words) - 2:
      print_sequence(sol_words, sol_seps)
    else:
      backtrack_seps(sol_words, seps, sol_seps, i + 1)

def bt_for_sep(sol_words, seps):
  # Start the separator search for one fixed ordering of the words.
  backtrack_seps(sol_words, seps, [''] * (len(sol_words) - 1), 0)

def backtrack_words(active, words, seps, sol_words, i):
  # Fill word slot i with every word that is still active (not yet used).
  for (wi, word) in enumerate(words):
    if active[wi]:
      sol_words[i] = word
      active[wi] = False

      if i == len(words) - 1:
        bt_for_sep(sol_words, seps)
      else:
        backtrack_words(active, words, seps, sol_words, i + 1)

      active[wi] = True  # make the word available again for the next candidate

# Sets drop duplicate words/separators; their iteration order is arbitrary.
backtrack_words([True] * len(a), set(a), set(b), [''] * len(a), 0)
print(count) #96

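In general, when you need to enumerate all the possibilities over a given set of values, you can use backtracking. The backtracking scheme is always the same: it is used first to permute the words and then repeated for the separators.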

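Edit

The second part of the problem, described as finding the combinations of separators, is actually the problem of finding all arrangements with repetition. Doing this is simpler than I thought: after choosing a separator from seps, you do not have to remove it (or, in this case, deactivate it); you simply leave it available.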
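
The difference between the two searches can be shown with a single backtracking routine. This is a minimal sketch, not the answerer's code: the function `arrangements` and its `reuse` flag are hypothetical names introduced for illustration. With `reuse=False` a chosen item is deactivated while the deeper slots are filled (the word case); with `reuse=True` it stays available (the separator case).

```python
def arrangements(items, length, reuse):
    """Yield every tuple of `length` items found by backtracking.

    reuse=False: an item is used at most once per tuple.
    reuse=True:  an item may appear in several slots.
    """
    items = list(items)
    active = [True] * len(items)
    current = [None] * length

    def fill(i):
        if i == length:
            yield tuple(current)
            return
        for idx, item in enumerate(items):
            if not active[idx]:
                continue
            current[i] = item
            if not reuse:
                active[idx] = False  # deactivate while filling deeper slots
            yield from fill(i + 1)
            active[idx] = True       # restore before trying the next item

    yield from fill(0)

words = ["filename", "timestamp", "custom"]
seps = ["_", "-", ".", ""]

lines = [
    "".join(w + s for w, s in zip(order, chosen)) + order[-1]
    for order in arrangements(words, len(words), reuse=False)
    for chosen in arrangements(seps, len(words) - 1, reuse=True)
]
print(len(lines))  # 96
```

Returning the results from a generator instead of printing them also avoids the global counter, at the cost of a slightly less literal match with the code above.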

Itertools - merging two lists to get all possible combinations
