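How can I check whether a large string (over 2 MB) contains any item from a list?

I tried:

    Dim Lit As New List(Of String)
    For x As Integer = 0 To 20000
        ' Store each number as a string; Add(x) only compiles with Option Strict Off.
        Lit.Add(x.ToString())
    Next
    If Lit.Any(Function(y) mytext.IndexOf(y, StringComparison.InvariantCulture) >= 0) Then
        'Code
    End If

But this takes 10 seconds. How can I speed it up?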
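Answer 0 (score: 0)

This will be faster. Lit is a HashSet of the strings to search for in mytext. The mytext string is scanned only once, starting at index 0. At each index, a substring is extracted from mytext for every possible search-string length, and each substring is looked up in the hash set.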

    Dim Lit As New HashSet(Of String)
    For x As Integer = 0 To 20000
        Lit.Add(x.ToString())
    Next

    ' Collect the distinct lengths of the strings in Lit, longest first.
    Dim lengths As New HashSet(Of Integer)
    For Each s As String In Lit
        lengths.Add(s.Length)
    Next
    Dim counts As List(Of Integer) = lengths.OrderByDescending(Function(n) n).ToList()

    ' Scan mytext from index 0; at each index, extract a substring of each
    ' possible length and check whether it is in the Lit hash set.
    Dim found As Boolean = False
    For i As Integer = 0 To mytext.Length - 1
        Dim remaining As Integer = mytext.Length - i
        For Each c As Integer In counts
            ' Skip lengths that would run past the end of mytext.
            If c <= remaining AndAlso Lit.Contains(mytext.Substring(i, c)) Then
                ' Found a search string in mytext.
                found = True
                Exit For
            End If
        Next
        If found Then Exit For
    Next
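
For reuse, the same scan could be wrapped in a function that returns as soon as the first match is found. The following is only a sketch under assumptions: the module name `ContainsAnySketch`, the function name `ContainsAny`, and the sample strings are hypothetical and not part of the original answer. Matching is ordinal, because that is how `HashSet(Of String)` compares strings by default.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ContainsAnySketch
    ' Hypothetical helper: True if text contains at least one string from items.
    Function ContainsAny(text As String, items As HashSet(Of String)) As Boolean
        ' Distinct, non-zero item lengths; only substrings of these lengths are checked.
        Dim lengths As List(Of Integer) =
            items.Select(Function(s) s.Length).Where(Function(n) n > 0).Distinct().ToList()

        For i As Integer = 0 To text.Length - 1
            Dim remaining As Integer = text.Length - i
            For Each n As Integer In lengths
                ' Skip lengths that would run past the end of text.
                If n <= remaining AndAlso items.Contains(text.Substring(i, n)) Then
                    Return True
                End If
            Next
        Next
        Return False
    End Function

    Sub Main()
        Dim items As New HashSet(Of String) From {"needle", "pin"}
        Console.WriteLine(ContainsAny("a haystack with a pin", items)) ' prints True
        Console.WriteLine(ContainsAny("just hay", items))              ' prints False
    End Sub
End Module
```

The work is roughly proportional to the length of the text times the number of distinct item lengths, rather than to the number of items, so it should suit lists with many strings of only a few different lengths, such as the numbers 0 to 20000.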
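Answer 1 (score: 0)

On my old system, simply using .Contains in a plain loop is practically instantaneous, and in this particular case, starting from the high index (20,000) makes it even faster.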
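
    ' Result stays True only if every item in Lit occurs in MyText;
    ' the loop stops at the first item that is missing.
    Dim Result As Boolean = True
    For x As Integer = 20000 To 0 Step -1
        If Not MyText.Contains(Lit(x)) Then
            'Console.WriteLine("Unfound:" & x.ToString)
            Result = False
            Exit For
        End If
    Next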
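
Note that the loop above tests whether every item is present, while the question asks whether any item is. A likely reason .Contains feels so much faster is that `String.Contains(String)` performs an ordinal comparison, whereas the question's `IndexOf` call requests a culture-sensitive one with `StringComparison.InvariantCulture`. As a sketch, the question's original `Any` check could be kept and switched to `StringComparison.Ordinal`. The module name, the function name `ContainsAnyOrdinal`, and the short placeholder text are hypothetical, and no timings are claimed here.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module OrdinalAnySketch
    ' Hypothetical helper: True if text contains at least one item, compared ordinally.
    Function ContainsAnyOrdinal(text As String, items As IEnumerable(Of String)) As Boolean
        Return items.Any(Function(y) text.IndexOf(y, StringComparison.Ordinal) >= 0)
    End Function

    Sub Main()
        Dim Lit As New List(Of String)
        For x As Integer = 0 To 20000
            Lit.Add(x.ToString())
        Next

        ' Placeholder text; in the question, mytext is a string of over 2 MB.
        Dim mytext As String = "order 4711 shipped"

        Console.WriteLine(ContainsAnyOrdinal(mytext, Lit))     ' prints True
        Console.WriteLine(ContainsAnyOrdinal("no digits", Lit)) ' prints False
    End Sub
End Module
```

Ordinal matching compares character codes directly, so it is only appropriate when culture-specific equivalences do not matter, as is the case for the digit strings in the question.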