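How do I get the inner text of the label that precedes an input element?

Time: 2012-01-01 17:51:24

Tags: vb.net html-agility-pack

My application uses the HTML Agility Pack. So far I can get all of the input elements on a form, but I am getting every input element by ID. I am trying to narrow this down so that it only returns the form's input elements, by ID, that are preceded by a label with a specific inner text.

Example: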

<label for="email">Email Address:</label>
<input type="text" class="textbox" name="email" id="email" maxlength="50" value="" dir="ltr" tabindex="1" 

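I am trying to get the input whose preceding label has the inner text "Email Address".

How would I express this?

Here is my application, which grabs all of the input elements by ID.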

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

    Dim doc As HtmlDocument
    Dim web As New HtmlWeb
    doc = web.Load("http://shaggybevo.com/board/register.php")
    Dim docNode As HtmlNode = doc.DocumentNode
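    'SelectNodes takes an XPath expression and returns Nothing when nothing matches
    Dim nodes As HtmlNodeCollection = docNode.SelectNodes("//input")
    If nodes Is Nothing Then Return

    For Each node As HtmlNode In nodes
        'Read the id and name attributes (empty string when an attribute is missing)
        Dim id As String = node.GetAttributeValue("id", "")
        Dim name As String = node.GetAttributeValue("name", "")

        'Append each input's id and name to the RichTextBox on Form2
        Form2.RichTextBox1.Text = Form2.RichTextBox1.Text & Environment.NewLine & id & " " & name
    Next

    'Show the results once, after the loop has finished
    Form2.Show()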

End Sub

Thanks, everyone. I have been learning VB.NET for a while now, and this forum has been great so far. I am glad I found it.

1 Answer:

Answer 0: (score: 0)

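The basic concept here is to find the labels whose for attribute matches the ID of the associated input.

So we first loop through the labels and record each label's text in a dictionary keyed by its for value. Then we loop through the inputs, and if an input's id is in the dictionary, we retrieve the value (the label text) from the dictionary and display it.

Note that I also changed how the data is collected to make it more efficient (almost any time you are concatenating strings, you should use a StringBuilder).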
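As a rough illustration of the StringBuilder point, here is a minimal sketch that collects several strings into one block of text. The module name StringBuilderSketch, the helper JoinLines, and the sample ids are hypothetical and only serve the example.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module StringBuilderSketch
    ' Hypothetical helper: joins the items into one string, one item per line.
    ' A StringBuilder appends into an internal buffer instead of allocating
    ' a new String on every concatenation.
    Function JoinLines(ByVal items As IEnumerable(Of String)) As String
        Dim sb As New StringBuilder()
        For Each item As String In items
            sb.AppendLine(item)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Sample ids, made up for this sketch
        Dim ids As String() = {"email", "username", "password"}
        Console.Write(JoinLines(ids))
    End Sub
End Module
```

For a handful of strings the difference is negligible; it starts to matter when text is appended inside a loop over many nodes.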

Here is the rewritten code:

    Dim web As HtmlAgilityPack.HtmlWeb = New HtmlWeb()
    Dim doc As HtmlAgilityPack.HtmlDocument = web.Load("http://shaggybevo.com/board/register.php")
    Dim nodes As HtmlNodeCollection

    ' Keeps track of the labels by the associated control id
    Dim labelText As New System.Collections.Generic.Dictionary(Of String, String)

    ' First, get the labels
    nodes = doc.DocumentNode.SelectNodes("//label")

    If nodes IsNot Nothing Then
        For Each node In nodes
            If node.Attributes.Contains("for") Then
                Dim sFor As String

                ' Extract the for value
                sFor = node.Attributes("for").Value

                ' If it does not exist in our dictionary, add it
                If Not labelText.ContainsKey(sFor) Then
                    labelText.Add(sFor, node.InnerText)
                End If
            End If
        Next
    End If

    nodes = doc.DocumentNode.SelectNodes("//input")

    Dim sbText As New System.Text.StringBuilder(500)

    If nodes IsNot Nothing Then
        For Each node In nodes
            ' See if this input is associated with a label
            If labelText.ContainsKey(node.Id) Then
                ' If it is, add it to our collected information
                sbText.Append("Label = ").Append(labelText(node.Id))
                sbText.Append(", Id = ").Append(node.Id)

                sbText.AppendLine()
            End If
        Next
    End If

    Form2.RichTextBox1.Text = sbText.ToString
    Form2.Show()
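
If only one specific label matters, the same for/id association can probably also be expressed as a single XPath query instead of a dictionary. The following is a sketch of that alternative, not the approach used in the code above. It assumes the HtmlAgilityPack assembly is referenced; the module name LabelLookupSketch, the helper FindInputByLabelText, and the inline HTML are hypothetical and only serve the example.

```vbnet
Imports System
Imports HtmlAgilityPack

Module LabelLookupSketch
    ' Hypothetical helper: returns the first <input> whose id equals the "for"
    ' attribute of a label containing labelText, or Nothing if there is none.
    ' Assumption: labelText contains no apostrophe, since it is embedded in
    ' the XPath string literal without escaping.
    Function FindInputByLabelText(ByVal html As String, ByVal labelText As String) As HtmlNode
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' XPath 1.0: keep the inputs whose id matches the "for" value of any
        ' label whose whitespace-normalized text contains labelText.
        Dim xpath As String = "//input[@id = //label[contains(normalize-space(.), '" &
                              labelText & "')]/@for]"
        Return doc.DocumentNode.SelectSingleNode(xpath)
    End Function

    Sub Main()
        ' Inline markup modeled on the example in the question
        Dim html As String = "<label for=""email"">Email Address:</label>" &
                             "<input type=""text"" name=""email"" id=""email"">"

        Dim inputNode As HtmlNode = FindInputByLabelText(html, "Email Address")
        If inputNode IsNot Nothing Then
            Console.WriteLine(inputNode.GetAttributeValue("name", ""))
        End If
    End Sub
End Module
```

Note that contains() performs a substring match. For an exact match, compare normalize-space(.) against the full label text, including the trailing colon.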