所以我的用户在注册表中写了他们的地址,但很多人都有拼写错误。我从城市记录中检索了另一个列表,其中包含这些地址的正确拼写。所以,让我说我有" Brooklny"由他们输入,我有正确的名字列表:布鲁克林,曼哈顿,布朗克斯,史坦顿岛,皇后区(这是一个例子,实际地址是西班牙语,并参考墨西哥城的邻里)。
我想找到布鲁克林和每个自治市镇名称之间的编辑距离,然后找到布鲁克林有最小编辑距离的单词。
所以编辑之间的距离:Brooklny-Brooklyn是2,Brooklny-Bronx是4,依此类推。布鲁克林当然最低2分。
想象一下,布鲁克林在小区A1和布鲁克林,曼哈顿,布朗克斯,史坦顿岛,皇后区各有一个小区,来自B1:B6
我在VBA中为Excel中的用户定义函数执行此操作,到目前为止,我有这段代码,但它不起作用。
Function Minl(ByVal string1 As String, ByVal correctos As Range) As Variant
Dim distancias(3) As Integer
Dim i, minimo As Integer
i = 0
For Each c In correctos.Cells
distancias(i) = Levenshtein(string1, c.Value)
i = i + 1
Next c
Minl = Minrange(distancias)
End Function
Function Levenshtein(ByVal string1 As String, ByVal string2 As String) As Long
Dim i As Long, j As Long
Dim string1_length As Long
Dim string2_length As Long
Dim distance() As Long
string1_length = Len(string1)
string2_length = Len(string2)
ReDim distance(string1_length, string2_length)
For i = 0 To string1_length
distance(i, 0) = i
Next
For j = 0 To string2_length
distance(0, j) = j
Next
For i = 1 To string1_length
For j = 1 To string2_length
If Asc(Mid$(string1, i, 1)) = Asc(Mid$(string2, j, 1)) Then
distance(i, j) = distance(i - 1, j - 1)
Else
distance(i, j) = Application.WorksheetFunction.Min _
(distance(i - 1, j) + 1, _
distance(i, j - 1) + 1, _
distance(i - 1, j - 1) + 1)
End If
Next
Next
Levenshtein = distance(string1_length, string2_length)
End Function
Function Minrange(ParamArray values() As Variant) As Variant
Dim minValue, Value As Variant
minValue = values(0)
For Each Value In values
If Value < minValue Then minValue = Value
Next
Minrange = minValue
End Function
我认为算法是正确的但我认为我可能在使用语法时遇到问题。 levenshtein函数有效,但我不确定其他两个。
答案 0 :(得分:0)
要获得最接近的输出,您可以使用它:
Function get_match(ByVal str As String, rng As Range) As String
Dim itm As Variant, outp(0 To 2) As Variant
outp(1) = 0: outp(2) = ""
For Each itm In rng.Text
outp(0) = Levenshtein(itm, str)
If outp(0) = 0 Then
get_match = itm
Exit Function
ElseIf outp(1) = 0 Or outp(0) < outp(1) Then
outp(1) = outp(0)
outp(2) = itm
End If
Next
get_match = outp(1)
End Function
稍后获取距离,您只需运行Levenshtein(string,get_match(string,range))
仍然......我不确定你在寻找什么:/