I am searching a file for a sequence of words, for example "one two three". I have been using:
Dim text As String = File.ReadAllText(filepath)
For Each phrase In phrases
    index = text.IndexOf(phrase, StringComparison.OrdinalIgnoreCase)
    If index >= 0 Then
        Exit For
    End If
Next
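It works fine, but I have now found that some files may contain the target phrase with more than one whitespace character between the words.
For example, my code finds "one two three" but fails to find "one   two   three".
Is there a way, using regular expressions or any other technique, to capture the phrase even when the words are separated by more than one space?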
I know I could use
Dim text As String = File.ReadAllText(filepath)
For Each phrase In phrases
    ' Replace each pair of spaces with a single space
    text = text.Replace("  ", " ")
    index = text.IndexOf(phrase, StringComparison.OrdinalIgnoreCase)
    If index >= 0 Then
        Exit For
    End If
Next
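but I would like to know whether there is a more efficient way to achieve this.
Answer 0 (score: 1)
You can create a function that removes any double spaces.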
Option Strict On
Option Explicit On
Option Infer Off
Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim testString As String = "one  two   three    four five  six"
        Dim excessSpacesGone As String = RemoveExcessSpaces(testString)
        'one two three four five six
        Clipboard.SetText(excessSpacesGone)
        MsgBox(excessSpacesGone)
    End Sub
    Function RemoveExcessSpaces(source As String) As String
        Dim result As String = source
        ' Keep replacing pairs of spaces with one space until no pair is left
        Do
            result = result.Replace("  ", " ")
        Loop Until result.IndexOf("  ") = -1
        Return result
    End Function
End Class
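Since the question asks about efficiency, here is a minimal sketch of how a function like this might be used: the text is normalized once, before the loop, instead of on every iteration. The names `PhraseSearch` and `FindFirstPhrase` are hypothetical, and the sketch assumes that only space characters (not tabs or line breaks) separate the words.

```vb
Imports System

Module PhraseSearch
    ' Same idea as RemoveExcessSpaces above: collapse runs of spaces into one space.
    Function RemoveExcessSpaces(source As String) As String
        Dim result As String = source
        Do While result.Contains("  ")
            result = result.Replace("  ", " ")
        Loop
        Return result
    End Function

    ' Hypothetical helper: returns the index (within the normalized text)
    ' of the first phrase that is found, or -1 if none is found.
    Function FindFirstPhrase(text As String, phrases As String()) As Integer
        ' Normalize the text once, outside the loop
        Dim normalized As String = RemoveExcessSpaces(text)
        For Each phrase As String In phrases
            Dim index As Integer = normalized.IndexOf(RemoveExcessSpaces(phrase), StringComparison.OrdinalIgnoreCase)
            If index >= 0 Then Return index
        Next
        Return -1
    End Function

    Sub Main()
        Dim phrases As String() = {"four five", "one two three"}
        Console.WriteLine(FindFirstPhrase("This contains one   Two  three", phrases))
    End Sub
End Module
```

Note that the returned index refers to the normalized text, not to the original file contents, so it is only useful as a found/not-found signal unless the positions are mapped back.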
Answer 1 (score: 1)
The comments in the code explain how it works.
' Requires: Imports System.Text.RegularExpressions
Dim inputStr As String = "This contains one  Two   three and some other words" '<--- this would be the input from the file
inputStr = Regex.Replace(inputStr, "\s+", " ") '<--- collapse every run of whitespace (including a single tab or line break) into one space
Dim searchStr As String = "one two three" '<--- the string to search for
searchStr = Regex.Replace(searchStr, "\s+", " ") '<--- normalize the search string the same way
If UCase(inputStr).Contains(UCase(searchStr)) Then '<--- case-insensitive check that the input contains the search string
    MsgBox("contains") '<--- display a message if it does
End If
Answer 2 (score: 0)
You can convert each phrase into a regular expression with \s+ between the words, and then check whether the text matches, e.g.:
' Requires: Imports System.Linq and Imports System.Text.RegularExpressions
Dim text = "This contains one  Two   three"
Dim phrases = {
    "one two three"
}
' Split each phrase into words, escape any regex metacharacters, and join the words with \s+.
For Each pattern In phrases.Select(Function(p) String.Join("\s+", p.Split({" "c}, StringSplitOptions.RemoveEmptyEntries).Select(Function(w) Regex.Escape(w))))
    If Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase) Then
        Console.WriteLine("Found!")
        Exit For
    End If
Next
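Note that this does not check for word boundaries at the start/end of the phrase, so "This contains someone two threesome" will also match. If you don't want that, add "\b" to both ends of the regex. (Adding "\s" would also reject that string, but it would fail to match a phrase at the very start or end of the text, or one next to punctuation.)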
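As a rough sketch of that boundary check, the pattern can be wrapped in `\b` anchors, which also match at the very start or end of the text where a literal `\s` would not. The helper name `BuildPhrasePattern` is hypothetical, and the sketch assumes every phrase begins and ends with a word character (letter, digit or underscore); otherwise `\b` will not behave as intended.

```vb
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module WholePhraseSearch
    ' Hypothetical helper: turns "one two three" into "\bone\s+two\s+three\b".
    Function BuildPhrasePattern(phrase As String) As String
        Dim words As String() = phrase.Split({" "c}, StringSplitOptions.RemoveEmptyEntries)
        ' Escape each word so regex metacharacters in a phrase are treated literally
        Dim body As String = String.Join("\s+", words.Select(Function(w) Regex.Escape(w)))
        Return "\b" & body & "\b"
    End Function

    Sub Main()
        Dim pattern As String = BuildPhrasePattern("one two three")
        ' Extra spaces and different casing still match: prints True
        Console.WriteLine(Regex.IsMatch("This contains one   Two three", pattern, RegexOptions.IgnoreCase))
        ' Words embedded in longer words no longer match: prints False
        Console.WriteLine(Regex.IsMatch("This contains someone two threesome", pattern, RegexOptions.IgnoreCase))
    End Sub
End Module
```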