I'm currently trying to develop an application that fetches data from a web page.
Let's say the page has the following content:
<needle1>HAYSTACK 1<needle2>
<needle1>HAYSTACK 2<needle2>
<needle1>HAYSTACK 3<needle2>
<needle1>HAYSTACK 4<needle2>
<needle1>HAYSTACK 5<needle2>
I have the following VB.NET code:
Dim webClient As New System.Net.WebClient
Dim FullPage As String = webClient.DownloadString("PAGE URL HERE")
Dim ExtractedInfo As String = GetBetween(FullPage, "<needle1>", "<needle2>")
GetBetween is the following function:
Function GetBetween(ByVal haystack As String, ByVal needle As String, ByVal needle_two As String) As String
    Dim istart As Integer = InStr(haystack, needle)
    If istart > 0 Then
        Dim istop As Integer = InStr(istart, haystack, needle_two)
        If istop > 0 Then
            Dim value As String = haystack.Substring(istart + Len(needle) - 1, istop - istart - Len(needle))
            Return value
        End If
    End If
    Return Nothing
End Function
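With the code above, ExtractedInfo always equals "HAYSTACK 1", because GetBetween only returns the text from the first match it finds.
My question is: how can I make ExtractedInfo some kind of array, so that it also holds the second, third, fourth, etc. occurrences?
Something like: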
ExtractedInfo(1) = HAYSTACK 1
ExtractedInfo(2) = HAYSTACK 2
Thanks in advance!
Answer 0 (score: 1)
Dim webClient As New System.Net.WebClient
Dim FullPage As String = webClient.DownloadString("PAGE URL HERE")
Dim ExtractedInfo As List(Of String) = GetBetween(FullPage, "<needle1>", "<needle2>")
Function GetBetween(ByVal haystack As String, ByVal needle As String, ByVal needle2 As String) As List(Of String)
    Dim result As New List(Of String)
    ' Split on the opening marker; every piece after the first one starts right after a needle.
    Dim split1 As String() = Split(haystack, needle)
    ' Start at index 1: split1(0) is the text before the first needle and is not a match.
    For i As Integer = 1 To split1.Length - 1
        ' The text before the closing marker is the match, provided the marker is present.
        Dim split2 As String() = Split(split1(i), needle2)
        If split2.Length > 1 AndAlso Not String.IsNullOrWhiteSpace(split2(0)) Then result.Add(split2(0))
    Next i
    Return result
End Function
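For comparison, here is a minimal sketch of the same idea that scans with String.IndexOf instead of splitting. It keeps the istart/istop logic of the question's GetBetween but repeats it from a moving position. The names GetAllBetween, GetAllBetweenSketch and pos are hypothetical and appear in neither the question nor the answer, and the sample string in Main stands in for the downloaded page.

```vbnet
Imports System
Imports System.Collections.Generic

Module GetAllBetweenSketch
    ' Hypothetical helper: returns every substring found between an opening
    ' marker (needle) and the closing marker (needle2) that follows it.
    Function GetAllBetween(ByVal haystack As String, ByVal needle As String, ByVal needle2 As String) As List(Of String)
        Dim result As New List(Of String)
        If String.IsNullOrEmpty(haystack) OrElse String.IsNullOrEmpty(needle) OrElse String.IsNullOrEmpty(needle2) Then Return result

        Dim pos As Integer = 0
        Do
            ' Find the next opening marker at or after the current position.
            Dim istart As Integer = haystack.IndexOf(needle, pos, StringComparison.Ordinal)
            If istart < 0 Then Exit Do
            istart += needle.Length

            ' Find the closing marker that follows it; stop if there is none.
            Dim istop As Integer = haystack.IndexOf(needle2, istart, StringComparison.Ordinal)
            If istop < 0 Then Exit Do

            result.Add(haystack.Substring(istart, istop - istart))
            ' Continue scanning just after the closing marker.
            pos = istop + needle2.Length
        Loop
        Return result
    End Function

    Sub Main()
        ' Stand-in for the downloaded page (assumption for this example).
        Dim page As String = "<needle1>HAYSTACK 1<needle2>" & Environment.NewLine & "<needle1>HAYSTACK 2<needle2>"
        Dim ExtractedInfo As List(Of String) = GetAllBetween(page, "<needle1>", "<needle2>")
        For i As Integer = 0 To ExtractedInfo.Count - 1
            Console.WriteLine("ExtractedInfo({0}) = {1}", i, ExtractedInfo(i))
        Next
    End Sub
End Module
```

Note that a List(Of String) is indexed from 0, so the first match is ExtractedInfo(0) rather than ExtractedInfo(1) as written in the question. Unlike the split-based version, this sketch keeps empty or whitespace-only matches. Filter them with String.IsNullOrWhiteSpace if they are not wanted.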