I am trying to work out a way to calculate the minimum percentage match when comparing a string against a column.
Example:
Column A     Column B
Key          Keylime
Key Chain    Status
             Serious
             Extreme
             Key
With the desired result being:
Column A     Column B    Column C    Column D
Key          Temp        100%        Key
Key Chain    Status      66.7%       Key Ch
Ten          Key Ch      100%        Tenure
             Extreme
             Key
             Tenure
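Expanding on this:
To expand on column C: when looking at "Key Chain", the highest match against any word in column B is "Key Ch", where 6 of the 9 characters (including the space) match, giving a percentage match of 6/9 = 66.7%.
The logic above fails on examples like "Ten", because it has no way to penalize a match. All 3 of the 3 characters in "Ten" are matched (by "Tenure" in the table), which gives it an inflated 100% match, and I still cannot think of a way to correct for that.

Answer 0 (score: 1)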
This should work (I have not tested it; I am on Linux at the moment). Call getStrMatch for each string.
    ' Result of a lookup: the best score and the reference word that produced it.
    Type StrMatch
        Percent As Double
        Word As String
    End Type

    ' Returns the best-matching word in RefRange for s, together with its score.
    Function getStrMatch(s As String, RefRange As Range) As StrMatch
        Dim i As Long, ref_str As String
        Dim BestMatch As StrMatch: BestMatch.Percent = -1
        Dim match_pc As Double
        With RefRange
            For i = 1 To .Cells.Count
                ref_str = .Cells(i).Value2
                match_pc = getMatchPc(s, ref_str)
                If match_pc > BestMatch.Percent Then
                    BestMatch.Percent = match_pc
                    BestMatch.Word = ref_str
                End If
            Next i
        End With
        getStrMatch = BestMatch
    End Function

    ' Score = number of shared leading characters / length of the longer string.
    Function getMatchPc(s As String, ref_str As String) As Double
        Dim s_len As Long: s_len = Len(s)
        Dim ref_len As Long: ref_len = Len(ref_str)
        Dim longer_len As Long
        If s_len > ref_len Then longer_len = s_len Else longer_len = ref_len
        ' Two empty strings: return 0 rather than divide by zero.
        If longer_len = 0 Then Exit Function
        Dim m As Long: m = 1
        ' VBA has no Exit While, so use Do While ... Loop with Exit Do.
        Do While m <= longer_len
            If Mid(s, m, 1) <> Mid(ref_str, m, 1) Then Exit Do
            m = m + 1
        Loop
        getMatchPc = (m - 1) / longer_len
    End Function
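As a worked check of how dividing by the longer length penalizes the inflated "Ten" case from the question, the score getMatchPc computes can be written out as follows. The symbols pc, p, |s| and |r| are notation introduced here for illustration, not part of the answer's code.

```latex
% p(s, r): number of leading characters that s and r have in common
\[
\mathrm{pc}(s, r) = \frac{p(s, r)}{\max(|s|, |r|)}
\]
\[
\mathrm{pc}(\text{Key Chain}, \text{Key Ch}) = \frac{6}{\max(9, 6)} \approx 66.7\%,
\qquad
\mathrm{pc}(\text{Ten}, \text{Tenure}) = \frac{3}{\max(3, 6)} = 50\%
\]
```

Because only leading characters are compared, a word that contains the string later on (for example "Key" inside "Monkey") scores 0.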
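Note that this code has to go in a standard module; otherwise (for example in a worksheet or class module) you have to declare Private Type and Private Function.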
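Also, if you are matching a lot of strings, you should probably build a trie, since this is only a naive string comparison: each call to getStrMatch costs O(mn), where m is the size of RefRange and n is the average ref_str length.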
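The trie idea could be sketched roughly as below. This is not the answer's code: buildTrie, getTrieMatch and the reserved "#word" key are hypothetical names chosen for illustration, and the sketch assumes Windows, where Scripting.Dictionary is available. Each node remembers the shortest word passing through it, so a lookup can apply the same "divide by the longer length" rule at every depth it reaches.

```vba
' Sketch only: buildTrie and getTrieMatch are hypothetical helpers.
' Each node is a Scripting.Dictionary. One-character keys point to child
' nodes; the reserved key "#word" (longer than one character, so it cannot
' collide with a child key) holds the shortest word passing through the node.
Function buildTrie(words As Variant) As Object
    Dim root As Object: Set root = CreateObject("Scripting.Dictionary")
    Dim w As Variant, word As String
    Dim node As Object, child As Object
    Dim ch As String, k As Long

    For Each w In words
        word = CStr(w)
        Set node = root
        For k = 1 To Len(word)
            ch = Mid$(word, k, 1)
            If node.Exists(ch) Then
                Set child = node.Item(ch)
                ' Keep the shortest word seen below this node.
                If Len(word) < Len(child.Item("#word")) Then child.Item("#word") = word
            Else
                Set child = CreateObject("Scripting.Dictionary")
                child.Add "#word", word
                node.Add ch, child
            End If
            Set node = child
        Next k
    Next w
    Set buildTrie = root
End Function

' Returns the best score for s and writes the matching word to bestWord.
Function getTrieMatch(s As String, root As Object, ByRef bestWord As String) As Double
    Dim node As Object: Set node = root
    Dim d As Long, ch As String
    Dim shortest As String, denom As Long
    Dim pc As Double, best As Double

    bestWord = ""
    For d = 1 To Len(s)
        ch = Mid$(s, d, 1)
        If Not node.Exists(ch) Then Exit For
        Set node = node.Item(ch)
        shortest = node.Item("#word")
        ' Same denominator as getMatchPc: the longer of the two strings.
        denom = Len(s)
        If Len(shortest) > denom Then denom = Len(shortest)
        pc = d / denom
        If pc > best Then
            best = pc
            bestWord = shortest
        End If
    Next d
    getTrieMatch = best
End Function
```

The trie would be built once from the reference words, for instance from an array such as Array("Temp", "Status", "Key Ch"), and then queried once per string; each lookup walks at most Len(s) nodes instead of scanning every reference word. When several words tie for the best score, this sketch may report a different one than getStrMatch does.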