我编写了一个简短的函数来在Excel工作簿中使用另一组字符串搜索一组字符串。功能如下:
Function CheckForString(Target As Range, List As Range)
Application.ScreenUpdating = False
Dim Output As String
Output = "No Match"
For Each Item In List
If Output = "No Match" Then
If Not Target.Find(Item.Value) Is Nothing Then Output = Item.Value
End If
Next
CheckForString = Output
Application.ScreenUpdating = True
End Function
问题是我正在使用此函数在大约400,000个字符串中搜索大约2,000个其他字符串中的任何一个。我已经设置了它,让它运行了几个小时,它没有完成计算。
因此工作表中有400,000个= CheckForString()实例,它们在~2,000个单元格范围内循环。我需要做的就是查看2,000个单元格中是否有任何字符串出现在每个400,000个单元格中。 E.g:
字符串:“APPLES-BANANAS 123459”
要查找的字符串:
APPLES-BANANAS
APPLES BANANAS
因此,如果它“击中”第一个,我不在乎它是否找到了。我只需要知道要找到的至少一个字符串就在那里。
感谢您对如何提高速度的想法!
最佳,
Grrollins
编辑:到目前为止评论中的问题(并感谢您抽出宝贵的时间!)
我搜索的数据是包含序列号(字母数字)的字符串,但没有一致的格式,包括额外的垃圾,空格,其他字符等。
我有一组原始的序列号,我试图与大数据转储进行比较。目标是确定大集合中的哪些字符串包含可以是我的列表中的序列号的字符串,因此我可以进一步查看这些记录并使用它们创建报告。
我希望能让它更清晰一点!再次感谢!
我也会尝试退出。如果有的话,也许它会刮几个小时!
答案 0 :(得分:1)
子程序MarkStrings
将以绿色突出显示Target
中包含来自List
范围的子字符串的所有字符串。
主要观点:
Strings
函数而不是WorksheetFunction
函数(同样,方式更快)。当然,你可以重新使用Sub
做你想做的事。请注意与您的Function
的想法不一样。虽然您的函数被调用了很多单元格,但是对于Target
范围内的所有Test()
范围,应该调用此子例程一次 - 请参阅 Public Sub Test()
Call MarkStrings(Sheet1.Range("C3:DG303"), Sheet1.Range("A1:B2"))
End Sub
Public Sub MarkStrings( _
ByVal Target As Range, _
ByVal List As Range _
)
Dim raw As Variant
Dim str_target() As String
Dim str_list() As String
Dim m As Long, m_min As Long, m_max As Long
Dim n As Long, n_min As Long, n_max As Long
Dim p As Long, p_min As Long, p_max As Long
Dim q As Long, q_min As Long, q_max As Long
' 0. Check ranges '
If (Target Is Nothing) Or (List Is Nothing) Then
Exit Sub
End If
Let Application.ScreenUpdating = False
' 1. Load the entire Target in memory, and make it string '
Let raw = Target.Value
Let m_min = LBound(raw, 1)
Let m_max = UBound(raw, 1)
Let n_min = LBound(raw, 2)
Let n_max = UBound(raw, 2)
ReDim str_target( _
m_min To m_max, _
n_min To n_max _
)
For m = m_min To m_max
For n = n_min To n_max
Let str_target(m, n) = CStr(raw(m, n))
Next n
Next m
Let raw = Empty
' 2. Load the entire List in memory, and make it string '
Let raw = List.Value
Let p_min = LBound(raw, 1)
Let p_max = UBound(raw, 1)
Let q_min = LBound(raw, 2)
Let q_max = UBound(raw, 2)
ReDim str_list( _
p_min To p_max, _
q_min To q_max _
)
For p = p_min To p_max
For q = q_min To q_max
Let str_list(p, q) = CStr(raw(p, q))
Next q
Next p
Let raw = Empty
' 3. Loop trough Target and check elements in List. If found, '
' make cell background green and go to next target. '
For m = m_min To m_max
For n = n_min To n_max
For p = p_min To p_max
For q = q_min To q_max
If Strings.InStr( _
Start:=1, _
String1:=str_target(m, n), _
String2:=str_list(p, q), _
Compare:=vbTextCompare _
) > 0 Then
Let Target.Cells(m, n).Interior.Color = vbGreen
GoTo NEXT_TARGET
End If
Next q
Next p
NEXT_TARGET:
Next n
Next m
Let Application.ScreenUpdating = True
End Sub
子例程。
{{1}}
搜索范围约为4的4个字符串。 300×100在我的机器上花了一秒钟。对于你的情况,它应该采取(400000×2000)/(4×30000)〜= 6700秒〜= 2小时。
答案 1 :(得分:0)
Appologies但我不明白为什么我们不使用range.find方法。场景是目标范围可能有4,00,000行,需要搜索近2000行。使用find循环将只有2000次。并且只会在几秒内得到结果,代码会更快
Function myfind(targetrng As Range, sourcerng As Range)
On Error Resume Next
Dim c As Range
Dim cell As Range
Set c = targetrng.Find(what:=cell.Value, LookIn:=xlValues, lookat:=xlPart, searchorder:=xlByRows, searchdirection:=xlNext)
If c Is Nothing Then
myfind = "No Match"
Else
myfind = "Match Found"
End If
End Function