Question

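Given a dictionary (a list of words) and an input word, return true if the input word is a single typo of some dictionary word, i.e. it has the same length as that word and differs from it in exactly one character.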

dictionary = ["apple", "testing", "computer"];
singleType(dictionary, "adple") // true
singleType(dictionary, "addle") // false
singleType(dictionary, "apple") // false
singleType(dictionary, "apples") // false

If we ignore the preprocessing time needed to build the hash map, I came up with a solution that runs in linear time.

O(k*26) => O(k), where k = length of the input word

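My linear solution converts the dictionary list into a hash map where the keys are the words and the values are booleans. It then walks over every position of the input word, replaces the character at that position with each of the 26 letters in turn, and checks whether the resulting word is in the hash map.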
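For reference, the approach described above could be sketched in Python roughly as follows. The function name single_typo_bruteforce is made up for this illustration, the hash map is replaced by a plain set, and the alphabet is assumed to be lowercase ASCII.

```python
import string

def single_typo_bruteforce(words, word):
    """Return True if `word` differs from some entry of `words` in exactly one position."""
    lookup = set(words)  # preprocessing step; a set stands in for the word -> bool hash map
    for i, original in enumerate(word):
        for letter in string.ascii_lowercase:  # assumption: lowercase ASCII alphabet
            if letter == original:
                continue  # skipping the unchanged letter rules out an exact match
            if word[:i] + letter + word[i + 1:] in lookup:
                return True
    return False
```

Each input word triggers up to 25 * k set lookups, which is where the constant factor in O(k*26) comes from.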

But they said I can do better than O(k*26). How?

Answer 1

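You could expand the dictionary with all variants of the words that contain a single typo, but instead of adding every actual typo you put a "wildcard" character, such as ? or *, in that place. Then you can check that (a) the word is not in the set of correctly spelled words, and (b) replacing some letter of the word with the same wildcard gives a string that can be found in the set of words with one wildcarded position.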

Example in Python:

>>> dictionary = ["apple", "testing", "computer"]
>>> wildcard = lambda w: [w[:i]+"?"+w[i+1:] for i in range(len(w))]
>>> onetypo = {x for w in dictionary for x in wildcard(w)}
>>> correct = {w for w in dictionary}
>>> word = "apxle"
>>> word not in correct and any(w in onetypo for w in wildcard(word))
True

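This reduces the complexity of a lookup to just O(k) set lookups, i.e. still linear in the number of letters, but without the high constant factor. However, it does inflate the dictionary considerably, by a factor roughly equal to the average number of letters per word.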
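To check this against the examples from the question, the same idea can be wrapped into two small functions. This is only a sketch: the names build_wildcard_index and single_typo are invented here, and it assumes the character "?" never occurs in a real word.

```python
def build_wildcard_index(words):
    """Return (correct, one_typo): the exact words and all single-wildcard variants."""
    correct = set(words)
    one_typo = {w[:i] + "?" + w[i + 1:] for w in correct for i in range(len(w))}
    return correct, one_typo

def single_typo(index, word):
    """True if `word` is not a dictionary word but matches one with one letter replaced."""
    correct, one_typo = index
    if word in correct:
        return False  # zero typos does not count as a single typo
    return any(word[:i] + "?" + word[i + 1:] in one_typo for i in range(len(word)))

index = build_wildcard_index(["apple", "testing", "computer"])
print(single_typo(index, "adple"))
print(single_typo(index, "apples"))
```

Like the snippet above, this returns False for any word that is itself in the dictionary, even when it is one letter away from another entry.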

Answer 2

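For a single lookup, I would filter the dictionary by word length, then iterate over the remaining words, counting the errors and bailing out of each word as soon as the error count is > 1.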

val dictionary = List ("affen", "ample", "apple", "appse", "ipple", "appl", "pple", "mapple", "apples")

@annotation.tailrec
def oneError (w1: String, w2:String, err: Int) : Boolean = w1.length match {
    case 0 => err == 1
    case _ => if (err > 1) false else {
        if (w1(0) == w2(0)) oneError (w1.substring (1),  w2.substring (1), err) else
        oneError (w1.substring (1),  w2.substring (1), err + 1)
    }
}

scala> dictionary.filter (_.length == 5).filter (s => oneError ("appxe", s, 0))
res5: List[String] = List(apple, appse)

To process a longer text, I would preprocess the dictionary and split it into a map (word.length -> List (words)).

For natural language, which is highly redundant, I would build a set of the unique words in the text and look each word up only once.
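The two preprocessing ideas above (bucketing the dictionary by length, and deduplicating the words of the text) might look like this in Python. It is a sketch under assumptions, not a port of the Scala code: the helper names group_by_length, one_error and near_misses are hypothetical, and the text is simply split on whitespace.

```python
from collections import defaultdict

def group_by_length(words):
    """Preprocess the dictionary into a map: word length -> list of words."""
    buckets = defaultdict(list)
    for w in words:
        buckets[len(w)].append(w)
    return buckets

def one_error(a, b):
    """True if the equal-length strings a and b differ in exactly one position."""
    errors = 0
    for x, y in zip(a, b):
        if x != y:
            errors += 1
            if errors > 1:
                return False  # bail out as soon as a second mismatch is seen
    return errors == 1

def near_misses(buckets, text):
    """Map each distinct word of `text` to the dictionary words one substitution away."""
    return {w: [d for d in buckets.get(len(w), []) if one_error(w, d)]
            for w in set(text.split())}
```

Because of the set, a word that appears many times in the text is compared against its length bucket only once.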

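For a single-word lookup, the worst case is n calls to the initial function, where n = max (dictionary.groupBy (w.length)), i.e. the size of the largest group of words that share a length.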

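Each word comparison (for word length > 1) takes at least 2 steps before it fails, but most words (assuming no pathological input or dictionary) are visited for only 2 steps. Of those that remain, most are excluded after 3 steps, and so on.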

Here is a version that shows how deep it looks into each word:

def oneError (word: String) : Array[String] = {

    @annotation.tailrec
    def oneError (w1: String, w2:String, steps: Int, err: Int) : Boolean = w1.length match {
        case 0 => {print (s"($steps) "); err == 1}
        case _ => if (err > 1) {print (s"$steps "); false } else {
            if (w1(0) == w2(0)) oneError (w1.substring (1),  w2.substring (1), steps +1, err) else
            oneError (w1.substring (1),  w2.substring (1), steps + 1, err + 1)
        }
    }

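    // dict is assumed to be the preprocessed map from word length to the words of that length (not shown)
    val d = dict (word.length)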
    println (s"Info: ${d.length} words of same length")
    d.filter (entry => oneError (word, entry, 0, 0))
}

Example output, edited:

scala> oneError ("fuck") 
Info: 3352 words of same length
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2  
2 2 2 2 2 2 2 2 (4) 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 (4) (4) 3 3 3 3 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 3 3 (4) (4) 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 
3 3 3 3 3 3 (4) 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 
3 (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) 3 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 (4) (4) 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
3 3 3 (4) 3 3 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 (4) 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
res53: Array[String] = Array(Buck, Huck, Puck, buck, duck, funk, luck, muck, puck, suck, tuck, yuck)

Answer 3

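It sounds like you are looking for an edit distance of 1 between the pattern and a dictionary entry. For example, if the pattern is "adple" and the dictionary entry is "apple", the edit distance is 1. You have the additional constraint that the pattern must have the same length as the dictionary entry, but that is easy to enforce.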
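With the same-length constraint, an edit distance of 1 can only come from a single substitution, so the check reduces to a Hamming distance of exactly 1. A minimal Python sketch (the function names hamming_distance and is_single_typo are chosen for this example only):

```python
def hamming_distance(a, b):
    """Number of positions at which the equal-length strings a and b differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

def is_single_typo(pattern, entry):
    """True if pattern has the same length as entry and differs in exactly one position."""
    return len(pattern) == len(entry) and hamming_distance(pattern, entry) == 1
```

Applied to every dictionary entry this is a linear scan over the dictionary, so it trades the preprocessing of the other answers for more work per lookup.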

Algorithm - check whether a given word is a single typo of a dictionary word
