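Extracting the "for" attribute from a label element in an HTML page

Time: 2016-09-07 15:43:03

Tags: html vb.net visual-studio getattribute

I have some code that analyzes various attributes of a web page when parts of that page are clicked. One of the things it picks out is the ID of the clicked element.

Sometimes there is no ID; instead, the clicked element is a label that uses a "for" attribute to reference an ID. In those cases I want to pick up the value of the "for" attribute.

I am trying to do this as follows: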

txtID.Text = TryCast(myHTMLDocument, HtmlDocument).GetElementFromPoint(lastMousePos).GetAttribute("id")
If txtID.Text = "" Then
  txtID.Text = TryCast(myHTMLDocument, HtmlDocument).GetElementFromPoint(lastMousePos).GetAttribute("for")
End If

For some reason, .GetAttribute("for") always returns an empty string. Am I referencing the attribute incorrectly, or is something else going on?

Sample HTML below:

<div class="question legal-owner active">

<a class="help-trigger help-trigger-layout">
    <span class="help-text-icon"></span>
</a>


<div class="quote-help quote-help-layout">
    <a class="quote-help-close-container">
        <div class="quote-help-close"></div>
    </a>
    <h3>Car ownership</h3>

    <p>
        We need to know whether the car belongs to you. If you don’t own the car but you’re the registered keeper, you should answer ‘No’ 
        (the owner of the car and the registered keeper can be different people).
    </p>

</div>

<span class="editor-label question-layout">
    <label for="OwningAndUsingCarPanel_LegalOwner">Are you (or will you be) the legal owner of this car?</label>
</span> 
    <ul class="question-layout yesno-radio-list">
        <li>
            <input name="OwningAndUsingCarPanel.LegalOwner" id="OwningAndUsingCarPanel_LegalOwner_true" type="radio" value="True">
            <label for="OwningAndUsingCarPanel_LegalOwner_true">
                <span>Yes</span>
            </label>
        </li>
        <li>
            <input name="OwningAndUsingCarPanel.LegalOwner" id="OwningAndUsingCarPanel_LegalOwner_false" type="radio" value="False">
            <label for="OwningAndUsingCarPanel_LegalOwner_false">
                <span>No</span>
            </label>
        </li>
    </ul>
<span class="editor-validation">
    <span class="field-validation-valid" id="OwningAndUsingCarPanel_LegalOwner_validationMessage"></span>
</span>
</div>

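1 answer:

Answer 0: (score: 0)

I solved this by writing my own function, getUnknown, which searches the element's markup for an attribute. It works for any attribute whose value is enclosed in double quotes. The function takes two parameters: the first is a string containing the element's tag with its attributes and values, and the second is the name of the attribute whose value should be extracted. If the attribute is not found, it returns "Nothing Found".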

'Requires Imports System.Text.RegularExpressions at the top of the file
Private Function getUnknown(myText As String, myAttr As String) As String
    Dim myResult As String = ""
    Dim myStart As Integer = 0
    Dim myLen As Integer = 0
    'remove any whitespace around the "=" sign
    Dim myCleanText As String = Regex.Replace(myText, "\s*=\s*", "=")
    'append =" to the attribute name so IndexOf does not match the bare word elsewhere in the text
    Dim myFullAttr As String = myAttr.Trim().ToLower() + "="""

    Try
        myStart = myCleanText.ToLower().IndexOf(myFullAttr)
        If myStart = -1 Then
            myResult = "Nothing Found"
        Else
            myStart = myStart + myFullAttr.Length
            myLen = myCleanText.IndexOf("""", myStart) - myStart
            myResult = myCleanText.Substring(myStart, myLen)
        End If
    Catch ex As Exception
        myResult = "Nothing Found"
    End Try

    Return myResult

End Function

For the case in my original question, I used it as follows:

Dim myElement As String = _
 TryCast(myHTMLDocument, HtmlDocument).GetElementFromPoint(lastMousePos).OuterHtml.ToString

txtID.Text = getUnknown(myElement, "for")
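
As a possible variation on getUnknown, here is a minimal sketch that does the lookup with a single regular expression, so it also accepts single-quoted values and does not match "for" inside a longer attribute name such as data-for. GetAttributeValue and the AttributeSketch module are hypothetical names used only for illustration; they are not part of the original code or of any library. It returns Nothing instead of "Nothing Found" when the attribute is missing, and, like getUnknown, it only looks at the text it is given, so it can be fooled by attribute-like text inside another attribute's value.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AttributeSketch
    ' Hypothetical helper (not from the original answer): returns the value of
    ' attrName found in the given markup, or Nothing when it is absent.
    Function GetAttributeValue(tagHtml As String, attrName As String) As String
        ' (?<![\w-]) keeps "for" from matching inside names such as "data-for".
        ' The value may be wrapped in double quotes or in single quotes.
        Dim pattern As String = "(?<![\w-])" & Regex.Escape(attrName.Trim()) &
                                "\s*=\s*(?:""([^""]*)""|'([^']*)')"
        Dim m As Match = Regex.Match(tagHtml, pattern, RegexOptions.IgnoreCase)
        If Not m.Success Then Return Nothing
        ' Group 1 holds a double-quoted value, group 2 a single-quoted one.
        Return If(m.Groups(1).Success, m.Groups(1).Value, m.Groups(2).Value)
    End Function

    Sub Main()
        ' Prints: OwningAndUsingCarPanel_LegalOwner_true
        Console.WriteLine(GetAttributeValue("<label for=""OwningAndUsingCarPanel_LegalOwner_true"">", "for"))
        ' Prints: abc
        Console.WriteLine(GetAttributeValue("<label FOR = 'abc'>", "for"))
    End Sub
End Module
```

With this helper, the call in my case would become txtID.Text = GetAttributeValue(myElement, "for"), with a check for Nothing before assigning.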