Question

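I have a list of words. Some of the words can be composed from two or more other words in the list, and I have to return all such combinations.

Input: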

words = ["leetcode","leet","code","le","et","etcode","de","decode","deet"]

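Output:

("leet", "code"), ("le", "et", "code"), ("de", "code"), etc.

What I have tried:

1) Trying all possible combinations would take too much time, so that is not a good idea.

2) I sense some form of dynamic programming here, for example reusing the solution for "leet" inside "leetcode". But I cannot express it precisely in pseudocode. How should I go about it?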

Answer 1

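Simple approach:
Sort the word list.
For each word A (leetcode), use binary search to find the words that are prefixes of A ('le', 'leet').
For each valid prefix, repeat the search for the rest of the word (i.e. look for etcode and code), and so on.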
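The steps above could be sketched roughly as follows. This is only one possible reading of the answer: the function name `decompositions` is made up for illustration, and because the prefixes of a word are not guaranteed to sit next to each other in sorted order, the sketch looks up each prefix separately with `bisect_left` rather than finding a single range.

```python
from bisect import bisect_left


def decompositions(target, sorted_words):
    """Return every way to write target as a sequence of words from sorted_words."""
    if target == "":
        return [[]]  # one way to build the empty string: use no words
    ways = []
    for end in range(1, len(target) + 1):
        prefix = target[:end]
        # binary search for the prefix in the sorted list
        i = bisect_left(sorted_words, prefix)
        if i < len(sorted_words) and sorted_words[i] == prefix:
            # valid prefix: repeat the search on the rest of the word
            for rest in decompositions(target[end:], sorted_words):
                ways.append([prefix] + rest)
    return ways


words = ["leetcode", "leet", "code", "le", "et", "etcode", "de", "decode", "deet"]
sorted_words = sorted(words)
for word in sorted_words:
    for combo in decompositions(word, sorted_words):
        if len(combo) >= 2:  # skip the trivial "word equals itself" case
            print(word, "=", combo)
```

Each word is always found as its own one-word decomposition, so the final loop keeps only combinations with at least two parts.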

Answer 2

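Can each word be used only once? For example, if you have de and dede, is (de, de) an answer? For simplicity, I will assume that each word appears only once in the list, that you have a lot of words, and that there is no memory limit.

1- Build a custom tree in which each node looks like this: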

class Node:
    def __init__(self):
        self.is_there_a_word_that_ends_here = False  # True if some word ends at this node
        self.children = dict()  # other nodes, key is the letter of the child node

Then, for example, if you have the four words ["ab", "abc", "ade", "c"], you end up with the tree below (a * marks a node whose is_there_a_word_that_ends_here value is true):

head ---a---b*---c*
    |   |
    |   L-d----e*
    |
    L--c*
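A minimal sketch of how such a tree could be built is shown below. The helper names `build_tree` and `contains` are assumptions made for this example and do not come from the answer; only the two node fields do.

```python
class Node:
    def __init__(self):
        self.is_there_a_word_that_ends_here = False
        self.children = {}  # letter -> child Node


def build_tree(words):
    """Insert every word letter by letter and mark the node where it ends."""
    head = Node()
    for word in words:
        node = head
        for letter in word:
            node = node.children.setdefault(letter, Node())
        node.is_there_a_word_that_ends_here = True
    return head


def contains(head, word):
    """Return True if word was inserted as a whole word."""
    node = head
    for letter in word:
        node = node.children.get(letter)
        if node is None:
            return False
    return node.is_there_a_word_that_ends_here


head = build_tree(["ab", "abc", "ade", "c"])
print(sorted(head.children))                       # letters directly under head
print(contains(head, "ab"), contains(head, "a"))   # "ab" is a word, "a" is only a path
```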

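2- Group the words by length. Start with the shortest words, because by the time you reach the longer words you want to already know the "breakdowns" of the shorter ones. You can do this recursively with an add_word_to_result function, which can (and should) cache its results.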

results = dict() # keys: possible words you can reach, values: ways to reach them
for word in words_in_ascending_length:
   add_word_to_result(word, tree, results)

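add_word_to_result then starts walking the tree. Whenever it sees is_there_a_word_that_ends_here at a node, it calls add_word_to_result(remaining_part_of_the_word, tree, results). For example, if you have "abc", you will see the * at "ab" and then call add_word_to_result("c", tree, results).

Implementing the recursive function is the "fun part" of the problem (and also the time-consuming part), so I leave it to you. As a bonus, you can think of an efficient way to avoid adding duplicates to the results (because duplicates will occur in some cases).

(Edit: you may need to cache the breakdowns of existing words and of non-existing words (such as the breakdown of the tail of a word) separately, so that you do not have to separate them before returning the results, if that sentence makes any sense.)

I hope this helps.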
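For readers who want a starting point, the walk described above might look something like the sketch below. It is not the answerer's code: `build_tree` is a hypothetical helper, and this version keeps words and non-word tails in the same `results` cache, so the final loop reads back only the entries for the original words.

```python
class Node:
    def __init__(self):
        self.is_there_a_word_that_ends_here = False
        self.children = {}


def build_tree(words):
    head = Node()
    for word in words:
        node = head
        for letter in word:
            node = node.children.setdefault(letter, Node())
        node.is_there_a_word_that_ends_here = True
    return head


def add_word_to_result(word, tree, results):
    """Cache in results[word] every way to split word into words stored in the tree."""
    if word in results:
        return results[word]
    ways = []
    node = tree
    for i, letter in enumerate(word):
        node = node.children.get(letter)
        if node is None:
            break  # no stored word continues along this path
        if node.is_there_a_word_that_ends_here:
            prefix, rest = word[:i + 1], word[i + 1:]
            if rest == "":
                ways.append([prefix])
            else:
                # recurse on the remaining part of the word
                for way in add_word_to_result(rest, tree, results):
                    ways.append([prefix] + way)
    results[word] = ways
    return ways


words = ["leetcode", "leet", "code", "le", "et", "etcode", "de", "decode", "deet"]
tree = build_tree(words)
results = {}
for word in sorted(words, key=len):  # shortest words first
    add_word_to_result(word, tree, results)

for word in words:
    for way in results[word]:
        if len(way) >= 2:  # ignore the word standing for itself
            print(word, "=", way)
```

Because each (word, way) pair is produced by exactly one path through the tree, this particular sketch should not generate duplicates as long as the input list itself has no repeated words.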

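Bonus: sample code (not really tested, but it should work; it could be improved a lot, but I am too lazy to do that right now). You could change the structure slightly to pass results to { {1}}, so that you remember all the possible combinations found so far and can simply use {1} instead of add_word_to_result, without doing unnecessary recursion: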

add_word_to_result(head, head, words_left[1:], combinations, words_passed+words_left[0]+",")

Answer 3

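In recursion, it is important to figure out the base case. Since you are asking for all combinations, it looks like you need to return a two-dimensional array.

What should I return when I reach ""? This is the main question you should ask yourself when solving any recursive problem. Since the structure will be a two-dimensional array, I should return [[]] when I hit "" and fill it in on the way back up to "leetcode".

This is the logic you have to implement. Starting from "leetcode", I look through all the given words and find that "leet" is one of the words in the array. One thing you have to be careful about is that the word taken from the words array must be a prefix of the target. For example, if the target word were "eleetcode", we would not use "leet"; the candidate would have to be something like "ele" or "eleet". With this information, all that is left is to write the code. I will do it in Python: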

class Solution:
    def construct(self, target: str, words: list[str]) -> list[list[str]]:
        # Base case: the empty string can be built in exactly one way, using no words
        if target == '':
            return [[]]
        # every way found for this target is collected here
        result = []
        # the tree branches out to at most len(words) children
        for word in words:
            # we look for a word that is a prefix of our target word
            if target.startswith(word):
                # if we found one, remove it and test with the new, shorter target
                suffix = target[len(word):]
                # keep calling recursively until the base case is reached
                bubble_up_ways = self.construct(suffix, words)
                # at each level, put the word in front of every way returned from below
                combinations = [[word, *way] for way in bubble_up_ways]
                result.extend(combinations)
        return result

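However, this is a brute-force solution: we make a recursive call for every word in the given array, so each call has up to len(words) = n branches, and we keep recursing until we hit the base case. What is the worst case for reaching the base case? Imagine the array is ["l","e","e","t","c","o","d","e"]. To reach the base case we then need len("leetcode") = m nested recursive calls; m is also the height of the recursion tree. So we make on the order of "n to the power of m" recursive calls. In addition, in each call the list comprehension that builds combinations copies each returned way, which also takes "m" time. Putting it together: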

  T:O(n^m * m) # this is exponential, because exponent is a variable

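For space complexity, we have "m" recursive calls on the stack, and we are storing the result array of length "n". Since each branch may return a combination, we have to store "n" arrays.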

  S: O(m*n)

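In a recursive function, if we store each result before returning it, then instead of repeating the same computation we can simply retrieve the result from the store. That way a branch does not have to make "m" recursive calls down to the base case again. Here is the memoized version; we just pass along a memo dictionary that stores the computed results: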

def memoized(self, target: str, words: list[str], memo=None):
    # create a fresh cache per top-level call; a mutable default ({}) would be
    # shared between calls that use different word lists
    if memo is None:
        memo = {}
    if target in memo:
        return memo[target]
    if target == "":
        return [[]]
    result = []
    for word in words:
        if target.startswith(word):
            suffix = target[len(word):]
            bubble_up_ways = self.memoized(suffix, words, memo)
            # put the word in front so each combination reads left to right
            result.extend([word, *way] for way in bubble_up_ways)
    # memoize before returning the value
    memo[target] = result
    return result

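However, unlike most other dynamic programming problems, this does not change the worst-case time complexity. Memoization pays off when we know the same subproblems will repeat, but after we trim "leetcode" by one of the words in the array, we get different remainders. For example:

leetcode - leet ---> code # first branch
leetcode - le   ---> etcode # second branch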
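To tie this back to the question's input, here is a small self-contained sketch of the same memoized recursion written as a plain function. The name `all_combinations` and the use of `functools.lru_cache` are choices made for this example, not part of the answer, and the sketch assumes the list contains no empty strings (an empty word would recurse forever).

```python
from functools import lru_cache


def all_combinations(words):
    """Map each word to the ways of building it from two or more words in the list."""
    word_tuple = tuple(words)

    @lru_cache(maxsize=None)
    def ways(target):
        # base case: one way to build "", namely with no words
        if target == "":
            return ((),)
        found = []
        for word in word_tuple:
            if target.startswith(word):
                for way in ways(target[len(word):]):
                    found.append((word, *way))
        return tuple(found)  # tuples, so cached results cannot be mutated

    result = {}
    for word in words:
        # drop the trivial combination where the word stands for itself
        combos = [way for way in ways(word) if len(way) >= 2]
        if combos:
            result[word] = combos
    return result


words = ["leetcode", "leet", "code", "le", "et", "etcode", "de", "decode", "deet"]
for word, combos in all_combinations(words).items():
    print(word, combos)
```

On this particular input some suffixes do repeat (for example "code" is reached from more than one branch), so the cache saves a little work, although the number of combinations that must be listed can still grow exponentially with the word length.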

Find all combinations of words that make up a word
