How do I find query string values in an HTML document?

Asked: 2011-12-05 01:54:17

Tags: asp.net regex vb.net

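I am trying to extract query string values from an HTML document. It contains many anchor links whose URLs carry a query string parameter named id. I want all of the ids collected into a single comma-separated string. How can I do this? The result I want is: result = {1,2,3,4,5}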

VB.NET code:

Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load

        Dim str As String() = GetParagraphs(System.IO.File.ReadAllText(Server.MapPath("TextFile1.html")))

        Response.Write(str)

    End Sub

    Private Shared Function GetParagraphs(ByVal data As String) As String()

        Dim result As New List(Of String)
        Dim m As Match = Regex.Match(data, "http://mywebsite.com/mydetails.aspx?id")
        While (m.Success)
            result.Add(m.Value)
            m = m.NextMatch()
        End While
        Return result.ToArray()
    End Function

TextFile1.html:

<a href="http://mywebsite.com/mydetails.aspx?id=1"
            target="_blank"></a>

            <a href="http://mywebsite.com/mydetails.aspx?id=2"
                target="_blank"></a>


                <a href="http://mywebsite.com/mydetails.aspx?id=3"
                    target="_blank"></a>


                    <a href="http://mywebsite.com/mydetails.aspx?id=4"
                        target="_blank"></a>


                        <a href="http://mywebsite.com/mydetails.aspx?id=5"
                            target="_blank"></a>

1 answer:

Answer 0 (score: 0)

You can use this modified version of the GetParagraphs method:

Private Shared Function GetParagraphs(ByVal data As String) As String()

    Dim result As New List(Of String)
    ' Define what we are looking for
    Const MY_MATCH As String = "http://mywebsite.com/mydetails.aspx?id="
    ' Escape regex metacharacters ("?" and also ".") so the URL is matched literally
    Dim m As Match = Regex.Match(data, Regex.Escape(MY_MATCH))
    While (m.Success)
        Dim wStartIndex As Integer
        Dim wEndIndex As Integer

        ' Jump to the end of the found string
        wStartIndex = m.Index + MY_MATCH.Length
        ' Now find the end of the href string
        wEndIndex = data.IndexOf("""", wStartIndex)
        ' If we found something
        If wEndIndex <> -1 Then
            ' Extract the value from the string
            result.Add(data.Substring(wStartIndex, wEndIndex - wStartIndex))
        End If
        m = m.NextMatch()
    End While
    Return result.ToArray()
End Function
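
The method above returns the ids as an array, while the question asks for one comma-separated string. A minimal sketch of one way to get there is shown below. It lets the regex capture the id directly and then joins the results with String.Join. The module name QueryIdDemo and the helper ExtractIds are hypothetical names introduced for illustration, and the pattern assumes every id is made up of digits only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module QueryIdDemo

    ' Hypothetical helper (not from the original post): collects the id value
    ' of every link that points at mydetails.aspx.
    Function ExtractIds(ByVal html As String) As List(Of String)
        Dim ids As New List(Of String)
        ' Regex.Escape makes "." and "?" literal; (\d+) captures the id.
        ' Assumption: ids consist of digits only.
        Dim pattern As String = _
            Regex.Escape("http://mywebsite.com/mydetails.aspx?id=") & "(\d+)"
        For Each m As Match In Regex.Matches(html, pattern)
            ' Groups(1) holds the text matched by the first capture group.
            ids.Add(m.Groups(1).Value)
        Next
        Return ids
    End Function

    Sub Main()
        ' Two sample links in the same shape as the file in the question.
        Dim html As String = _
            "<a href=""http://mywebsite.com/mydetails.aspx?id=1""></a>" & _
            "<a href=""http://mywebsite.com/mydetails.aspx?id=2""></a>"
        ' Join the captured ids into one comma-separated string.
        Console.WriteLine(String.Join(",", ExtractIds(html).ToArray()))
    End Sub

End Module
```

In the page from the question, the same join would go into Page_Load, for example Response.Write(String.Join(",", str)). Passing the array itself to Response.Write, as the original code does, writes the type name System.String[] instead of the values.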