Where to start: what set of N letters makes the most words?

Time: 2016-10-19 17:39:41

Tags: algorithm

I'm having trouble coming up with a non-brute-force approach to a problem I've been wondering about: what set of N letters can be used to make the most words from a given dictionary? Letters can be used any number of times.

For example, for N=3, we can have EST, which gives words like TEST and SEE.

Searching online, I found some answers (such as listed above for EST), but no description of the approach.

My question is: what well-known problems are similar to this, or what principles should I use to tackle this problem?

NOTE: I know it's not necessarily true that if EST is the best for N=3, then ESTx is the best for N=4. That is to say, you can't just append a letter to the previous solution.

In case you're wondering, this question came to mind because I was wondering what set of 4 ingredients could make the most cocktails, and I started searching for that. Then I realized my question was specific, and so I figured this letter question is the same type of problem, and started searching for it as well.

2 answers:

Answer 0 (score: 2)

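For each word in the dictionary, sort its letters and remove duplicates. Call the result the skeleton of the word. For each skeleton, count how many words reduce to it. Call this its frequency. Ignore all skeletons whose size is greater than N.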
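A minimal sketch of the skeleton and frequency step, assuming words are plain strings and case does not matter. The function names and the toy word list are made up for illustration.

```python
from collections import Counter

def skeleton(word):
    """Sorted, de-duplicated letters of a word, e.g. 'test' -> 'est'."""
    return "".join(sorted(set(word.lower())))

def skeleton_frequencies(words, n):
    """Count words per skeleton, dropping skeletons with more than n letters."""
    counts = Counter(skeleton(w) for w in words)
    return {s: c for s, c in counts.items() if len(s) <= n}

# Hypothetical toy dictionary, not real data.
words = ["test", "see", "set", "tee", "stress"]
freq = skeleton_frequencies(words, 3)
print(freq)  # 'stress' has skeleton 'erst' (4 letters) and is dropped for N=3
```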

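Let a subskeleton be any result of removing one or more letters from a skeleton (leaving at least one), e.g. EST has the subskeletons E, S, T, ES, ET, ST. For each skeleton of size N, add up the counts of this skeleton and all of its subskeletons. Select the skeleton with the maximal sum.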

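You need O(2**N * D) operations, where D is the size of the dictionary.

Correction: we need to consider all skeletons of size N (not only those of words), and the number of operations will be O(2**N * C(L,N)), where L is the number of letters (26 in English).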
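Putting the pieces together, here is a rough sketch of the corrected approach: every N-letter combination of the alphabet is a candidate, and its score is the sum of the frequencies of all its non-empty subsets. The function name, the lowercase ASCII alphabet, and the toy word list are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations
import string

def best_letter_set(words, n, alphabet=string.ascii_lowercase):
    """Return (letters, word_count) for the best n-letter set.

    Ties are broken in favour of the first candidate in alphabetical order.
    """
    # Frequency of each skeleton (sorted, de-duplicated letters of a word).
    freq = Counter("".join(sorted(set(w.lower()))) for w in words)
    best, best_total = None, -1
    # C(L, N) candidates; combinations() of a sorted alphabet stay sorted.
    for candidate in combinations(alphabet, n):
        total = 0
        # Up to 2**N - 1 non-empty subsets, the candidate itself included.
        for size in range(1, n + 1):
            for sub in combinations(candidate, size):
                total += freq.get("".join(sub), 0)
        if total > best_total:
            best, best_total = "".join(candidate), total
    return best, best_total

# Hypothetical toy dictionary, not real data.
words = ["test", "see", "set", "tee", "stress", "a", "at"]
print(best_letter_set(words, 3))  # ('est', 4)
print(best_letter_set(words, 2))  # ('at', 2)
```

On this toy list the best pair, AT, is not contained in the best triple, EST, which matches the note in the question that you can't just extend the previous solution by one letter.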

Answer 1 (score: 0)

So I coded up a solution to this problem that uses hash tables to get the job done. I had to deal with a few problems along the way!

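1. Let N be the size of the group of letters you are looking for, the group that can make the most words. Let L be the length of the dictionary.
2. Convert each word in the dictionary into a set of letters: 'test' -> {'e','s','t'}
3. For each number from 1 to N inclusive, create a cut list containing the words that can be made with exactly that many distinct letters.
4. Keep a hash table for each number from 1 to N inclusive, then go through the corresponding cut list, using each letter set as a key and incrementing its value by 1 for each member of the cut list.
5. This is the part that gave me trouble! Create a set out of the cut list for N (unique_cut_list). This is effectively all the populated keys of the hash table for N.
6. For each set in unique_cut_list, generate all subsets and check the corresponding hash table (the one for the size of the subset) to see if there is a value. If there is, add that value to the hash table for N under the key of the original set.
7. Finally, go through the hash table for N and find the maximum value. The corresponding key is the group of letters you are after.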
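The steps above could look roughly like this in Python. This is my reading of the description, not the answerer's actual code: the function name is invented, frozenset is used so that letter sets can serve as hash keys, and only proper subsets are added so that a set's own count is not added twice. Note that, as described, only letter sets of size exactly N that occur for some word are candidates.

```python
from collections import Counter
from itertools import combinations

def best_letter_set_hashed(words, n):
    """Return (letters, word_count) following the hash-table steps above."""
    # Convert each word into a hashable set of letters.
    letter_sets = [frozenset(w.lower()) for w in words]

    # One hash table per size 1..n, counting words per letter set
    # (this merges the "cut list" and "hash table" steps into one pass).
    tables = {k: Counter() for k in range(1, n + 1)}
    for s in letter_sets:
        if 1 <= len(s) <= n:
            tables[len(s)][s] += 1

    # The unique letter sets of size exactly n are the candidates.
    unique_cut_list = list(tables[n])

    # Start from each candidate's own count, then add every proper subset's.
    totals = Counter(tables[n])
    for s in unique_cut_list:
        for k in range(1, n):
            for sub in combinations(sorted(s), k):
                totals[s] += tables[k].get(frozenset(sub), 0)

    # Pick the candidate with the largest total.
    if not totals:
        return None, 0
    best = max(totals, key=totals.get)
    return "".join(sorted(best)), totals[best]

# Hypothetical toy dictionary, not real data.
words = ["test", "see", "set", "tee", "stress", "a", "at"]
print(best_letter_set_hashed(words, 3))  # ('est', 4)
```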

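For steps 1-5 you go through the dictionary 1 + 2N times; step 6 goes through a version of the dictionary and checks (2^N)-1 subsets each time (ignoring the empty set). That gives O(2NL + L*2^N), which should approach O(L*2^N). Not bad, since N will not be too big in most applications!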
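To spell out why the first term is absorbed, assuming N >= 1 (so that 2N <= 2^N):

```latex
2NL + L \cdot 2^N \;\le\; L \cdot 2^N + L \cdot 2^N \;=\; 2L \cdot 2^N \;=\; O(L \cdot 2^N)
```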