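My sub compares two lists of strings and returns the closest matches. I have found that the sub gets tripped up by some common words, such as "the" and "facility". I would like to write a function that is given an array of words to exclude, checks each string for those words, and removes them if found.
Here is a sample input: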
|aNames | bNames | words to exclude
|thehillcrest |oceanview health| the
|oceanview, the|hillCrest | health
Expected output:
|aResults |bResults
|hillcrest |hillcrest
|oceanview |oceanview
So far I have:
Dim ub as Integer
Dim excludeWords() As String
'First grab the words to be excluded
If sheet.Cells(2, 7).Value <> "" Then
For y = 2 To sheet.Range("G:G").End(xlDown).Row
ub = UBound(excludeWords) + 1 'I'm getting a subscript out of range error here..?
ReDim Preserve excludeWords(0 To ub)
excludeWords(ub) = sheet.Cells(y, 7).Value
Next y
End If
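Then my comparison function, using a double loop, compares each string in column A against column B. Before the comparison, the values in columns A and B are passed through our function, which checks for the words to exclude. There may be no words to exclude, so the parameter should be optional: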
Public Function normalizeString(s As String, ParamArray a() As Variant)
if a(0) then 'How can I check?
for i = 0 to UBound(a)
s = Replace(s, a(i))
next i
end if
normalizeString = Trim(LCase(s))
End Function
Probably only a few parts of this code are not working. Can you point me in the right direction?
Thanks!
Answer 0 (score: 6)
To store the list in an array, you can do this:
Sub Sample()
Dim excludeWords As Variant
Dim lRow As Long
With Sheet1 '<~~ Change this to the relevant sheet
'~~> Get last row in Col G
lRow = .Range("G" & .Rows.Count).End(xlUp).Row
excludeWords = .Range("G2:G" & lRow).Value
'Debug.Print UBound(excludeWords)
'For i = LBound(excludeWords) To UBound(excludeWords)
'Debug.Print excludeWords(i, 1)
'Next i
End With
End Sub
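Then pass the array to your function. The array above is a 2D array, so it needs to be handled accordingly (see the commented-out section in the code above).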
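One detail worth guarding against: when a range covers a single cell, Range.Value returns a plain value rather than a 2D array, so UBound would fail if column G held only one word. A minimal sketch of a wrapper, where As2DArray is a hypothetical helper name and not part of the code above:

```vba
' Hypothetical helper: always returns a 1-based 2D array,
' even when v is a single value (as Range.Value is for one cell).
Public Function As2DArray(ByVal v As Variant) As Variant
    Dim tmp(1 To 1, 1 To 1) As Variant

    If IsArray(v) Then
        As2DArray = v          ' already an array: pass it through unchanged
    Else
        tmp(1, 1) = v          ' wrap the single value in a 1x1 array
        As2DArray = tmp
    End If
End Function
```

With this in place, excludeWords = As2DArray(.Range("G2:G" & lRow).Value) can be indexed as excludeWords(i, 1) in both cases.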
Also, as I mentioned in my comment above, how does "oceanview, the" become "Oceanview"? You can replace "the", but that will give you "oceanview," (note the comma) rather than "Oceanview".
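You may have to add those special characters to Col G in the worksheet, or you can handle them inside the function with a loop. For that you will have to work with the ASCII character codes. Please see this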
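A rough sketch of the loop approach, assuming the names contain only plain ASCII letters, digits and spaces that should be kept. StripPunctuation is a hypothetical helper name; any character outside the listed code ranges (including accented letters) is dropped, which may or may not suit your data:

```vba
' Hypothetical helper: keeps letters, digits and spaces, drops everything else.
Public Function StripPunctuation(ByVal s As String) As String
    Dim i As Long
    Dim ch As String
    Dim code As Long
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        code = Asc(ch)
        Select Case code
            Case 32, 48 To 57, 65 To 90, 97 To 122   ' space, 0-9, A-Z, a-z
                result = result & ch
            ' Any other character (comma, period, ...) is skipped.
        End Select
    Next i

    StripPunctuation = result
End Function
```

For example, StripPunctuation("oceanview, the") returns "oceanview the", so once the word "the" is excluded only "oceanview" is left.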
Follow-up from comments
This is something I wrote quickly, so it has not been tested extensively. Is this what you are looking for?
Sub Sample()
Dim excludeWords As Variant
Dim lRow As Long
With Sheet1
lRow = .Range("G" & .Rows.Count).End(xlUp).Row
excludeWords = .Range("G2:G" & lRow).Value
'~~> My column G has the word "habilitation" and "this"
Debug.Print normalizeString("This is rehabilitation", excludeWords)
'~~> Output is "is rehabilitation"
End With
End Sub
Public Function normalizeString(s As String, a As Variant) As String
Dim i As Long, j As Long
Dim tmpAr As Variant
If InStr(1, s, " ") Then
tmpAr = Split(s, " ")
For i = LBound(a) To UBound(a)
For j = LBound(tmpAr) To UBound(tmpAr)
If LCase(Trim(tmpAr(j))) = LCase(Trim(a(i, 1))) Then tmpAr(j) = ""
Next j
Next i
s = Join(tmpAr, " ")
Else
For i = LBound(a) To UBound(a)
If LCase(Trim(s)) = LCase(Trim(a(i, 1))) Then
s = ""
Exit For
End If
Next i
End If
normalizeString = Trim(LCase(s))
End Function
Answer 1 (score: 5)
First, you cannot call the UBound function on a dynamic array that has not been given a size yet:
Dim excludeWords() As String
ub = UBound(excludeWords) + 1 'there is no size yet
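One way around this, sketched here under the assumption that the words sit in column G of Sheet1 starting at row 2, is to track the element count in a separate variable instead of asking UBound. AppendWord and LoadExcludeWords are hypothetical names used only for illustration:

```vba
' Hypothetical helper: appends w to words() and keeps the size in wordCount,
' so UBound is never called on an unallocated array.
Public Sub AppendWord(ByRef words() As String, ByRef wordCount As Long, ByVal w As String)
    ReDim Preserve words(0 To wordCount)
    words(wordCount) = w
    wordCount = wordCount + 1
End Sub

Sub LoadExcludeWords()
    Dim excludeWords() As String
    Dim wordCount As Long          ' starts at 0
    Dim y As Long, lastRow As Long

    With Sheet1                    ' assumption: change to the relevant sheet
        lastRow = .Range("G" & .Rows.Count).End(xlUp).Row
        For y = 2 To lastRow
            ' Skip blank cells so the array only holds real words
            If Len(CStr(.Cells(y, 7).Value)) > 0 Then
                AppendWord excludeWords, wordCount, CStr(.Cells(y, 7).Value)
            End If
        Next y
    End With

    Debug.Print wordCount & " word(s) loaded"
End Sub
```

ReDim Preserve copies the array on every call, so for long lists reading the whole range into a Variant in one step is likely to be faster.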
To remove unwanted words, use the Replace function:
String1 = Replace(String1, "the", "")
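Two things to keep in mind with this call: Replace compares case-sensitively unless told otherwise, and it works on substrings, so it also removes "the" from inside longer words. A small sketch of the case-insensitive form, where RemoveWordAnywhere is a hypothetical wrapper name:

```vba
' Hypothetical wrapper: removes every occurrence of word from s, ignoring case.
' Note that this is substring removal, not whole-word removal.
Public Function RemoveWordAnywhere(ByVal s As String, ByVal word As String) As String
    ' Arguments: start at position 1, -1 = replace all occurrences,
    ' vbTextCompare = case-insensitive comparison
    RemoveWordAnywhere = Replace(s, word, "", 1, -1, vbTextCompare)
End Function
```

RemoveWordAnywhere("The Hillcrest", "the") returns " Hillcrest" (leading space included, so Trim the result), while RemoveWordAnywhere("theater", "the") returns "ater". If only whole words should go, splitting on spaces and comparing each token, as normalizeString in the other answer does, avoids that second effect.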
To do the comparison you describe, I would use the Like operator. Here is the documentation: http://msdn.microsoft.com/pl-pl/library/swf8kaxw.aspx
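As a minimal sketch of what a Like-based check could look like in VBA, the function below tests whether one name contains the other, ignoring case. LooksLike is a hypothetical name, and the sketch assumes target contains no pattern characters (*, ?, #, [), since Like would treat them as wildcards:

```vba
' Hypothetical helper: True when target occurs anywhere inside candidate.
' Both sides are lower-cased so the result does not depend on Option Compare.
Public Function LooksLike(ByVal candidate As String, ByVal target As String) As Boolean
    LooksLike = (LCase$(candidate) Like ("*" & LCase$(target) & "*"))
End Function
```

LooksLike("Oceanview Health", "oceanview") is True and LooksLike("hillcrest", "oceanview") is False. This is only a containment test, so it does not rank candidates by closeness the way the question's matching sub is meant to.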