说,我有一个50000个元素的字符串数组。使用For Next
搜索数组对于如此庞大的数组而言是如此之慢。有没有快速的搜索方式?
注意:使用join
& instr
我们可以在数组中搜索一个字符串,但是这个方法并不好,因为我找不到元素编号
注意:数组未排序。我正在寻找子串
答案 0 :(得分:3)
这是您使用Join
和InStr
:
Sub TestArraySearch()
Dim A(4) As String
A(0) = "First"
A(1) = "Second"
A(2) = "Third"
A(3) = "Fourth"
A(4) = "Fifth"
Debug.Print FastArraySearch(A, "Fi")
Debug.Print FastArraySearch(A, "o")
Debug.Print FastArraySearch(A, "hird")
Debug.Print FastArraySearch(A, "Fou")
Debug.Print FastArraySearch(A, "ndTh")
Debug.Print FastArraySearch(A, "fth")
End Sub
Function FastArraySearch(SearchArray As Variant,SearchPhrase As String) As String
Dim Pos As Long, i As Long, NumCharsProcessed As Long, Txt As String
Pos = InStr(Join(SearchArray, "§"), SearchPhrase)
If Pos > 0 Then
For i = LBound(SearchArray) To UBound(SearchArray)
NumCharsProcessed = NumCharsProcessed + Len(SearchArray(i)) + 1
If NumCharsProcessed >= Pos Then
FastArraySearch = SearchArray(i)
Exit Function
End If
Next i
End If
End Function
我没有对它进行基准测试,但它应该比每次循环执行单独搜索更快。它搜索一次,然后只是将字符串长度相加,直到它到达匹配的位置。由于字符串的长度存储在字符串中的任何字符之前,因此Len
函数已经过高度优化。
如果这种性能仍然是不可接受的,我认为你需要找到一个不同于数组的数据结构(例如,@ Remou建议的断开连接的记录集)。
答案 1 :(得分:3)
尝试使用过滤器( InputStrings ,值 [,包含 [,比较]] )功能。它返回匹配字符串的数组。
完整语法可在MSDN
上找到答案 2 :(得分:3)
您可以在需要多长时间内显示您正在使用的代码吗?还有,多久太久了?此代码读入50,000个字符串,并在300毫秒内找到包含子字符串的275。
Sub testarr()
Dim vaArr As Variant
Dim i As Long
Dim dTime As Double
Dim lCnt As Long
dTime = Timer
vaArr = Sheet1.Range("A1:A50000")
For i = LBound(vaArr, 1) To UBound(vaArr, 1)
If InStr(1, vaArr(i, 1), "erez") > 0 Then
lCnt = lCnt + 1
Debug.Print i, vaArr(i, 1)
End If
Next i
Debug.Print Timer - dTime
Debug.Print lCnt
End Sub
答案 3 :(得分:0)
这是一种快速返回子字符串出现次数的方法。希望它有所帮助!
Option Explicit
Option Compare Binary
Option Base 0
DefLng A-Z
Sub TestSubStringOccurence()
Dim GrabRangeArray() As Variant
Dim i As Long
Dim L As Long
Dim RunTime As Double
Dim SubStringCounter As Long
Dim J As Long
Dim InStrPosition As Long
Dim Ws As Excel.Worksheet
Set Ws = ThisWorkbook.Sheets("Sheet1")
RunTime = Timer
With Ws
For i = 1 To 50000
If i Mod 2 = 0 Then .Cells(i, 1).Value2 = "1 abcdef 2 abcdef 3 abcdef 4 abcdef 5 abcdef" _
Else .Cells(i, 1).Value2 = i Next i
GrabRangeArray = .Range("a1:a50000").Value
End With
RunTime = Timer
'returns number of substring occurrences
For i = 1 To UBound(GrabRangeArray, 1)
InStrPosition = 1
Do
InStrPosition = InStr(InStrPosition, GrabRangeArray(i, 1), "abcdef", vbBinaryCompare)
If InStrPosition <> 0 Then
SubStringCounter = SubStringCounter + 1
InStrPosition = InStrPosition + 6
End If
Loop Until InStrPosition = 0
Next i
Debug.Print "Runtime: " & Timer - RunTime & ", ""abcdef"" occurences: " & SubStringCounter
End Sub
这是测试子字符串是否存在的快速方法,但不返回子字符串出现次数。
Option Explicit
Option Compare Binary
Option Base 0
DefLng A-Z
Sub TestSubStringOccurence()
Dim GrabRangeArray() As Variant
Dim I As Long
Dim L As Long
Dim RunTime As Double
Dim SubStringCounter As Long
Dim J As Long
Dim InStrPosition As Long
Dim Ws As Excel.Worksheet
Const ConstABCDEFString As String = "abcdef"
Dim B As Boolean
Set Ws = ThisWorkbook.Sheets("Sheet1")
RunTime = Timer
ReDim GrabRangeArray(0 To 49999)
With Ws
For I = 1 To 50000
If I Mod 2 = 0 Then GrabRangeArray(I - 1) = "1 abcdef 2 abcdef 3 abcdef 4 abcdef 5 abcdef" _
Else GrabRangeArray(I - 1) = I - 1
Next I
.Range("a1:a50000").Value = Application.Transpose(GrabRangeArray)
End With
RunTime = Timer
For I = 1 To UBound(GrabRangeArray, 1)
If InStrB(1, GrabRangeArray(I), ConstABCDEFString, vbBinaryCompare) Then _
SubStringCounter = SubStringCounter + 1
Next I
Debug.Print "Runtime: " & Timer - RunTime & ", ""abcdef"" occurences: " & SubStringCounter
End Sub
答案 4 :(得分:0)
在VB6中加速任何数组索引操作的首要方法是使用以下选项重新编译组件:
现在,您的数组索引应该与等效的C / C ++操作一样快。
唯一的问题是您应该确保您的代码从不引用其正常数组边界之外的索引。以前,您将收到VB运行时错误。在此之后,您可能会获得访问冲突。
答案 5 :(得分:0)
我使用了Joins
和Splits
,但没有做任何benchmark
:
Function IndexOf(ByRef arr() As String, ByVal str As String) As Integer
Dim joinedStr As String
Dim strIndex As Integer
joinedStr = "|" & Join(arr, "|")
strIndex = InStr(1, joinedStr, str)
If strIndex = 0 Then
IndexOf = -1
Exit Function
End If
joinedStr = Mid(joinedStr, 1, strIndex - 1)
IndexOf = UBound(Split(joinedStr, "|")) - 1
End Function