HTMLDocument object is missing a method in Visual Studio

Time: 2017-07-29 18:27:38

Tags: vb.net vba visual-studio

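I'm fairly new to programming. I wrote a web scraper in VBA and am now trying to recreate it in VB.NET in Visual Studio. I'm using the same object I used in VBA (mshtml.HTMLDocument), but for some reason in Visual Studio it seems to be missing the .getElementsByClassName method, which is essential to my program. I don't understand why it would be missing in VB.NET in Visual Studio when I'm using the same reference library and the same object as I did in VBA.

Am I doing something wrong?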

VBA Intellisense & Reference Library

Visual Studio VB.Net Intellisense, Reference Library, & Error

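1 Answer:

Answer 0: (score: 0)

System.Windows.Forms.HtmlDocument (in VB.NET) is not the same type as mshtml.HtmlDocument (in VBA). Without seeing the relevant code, I can't tell whether you have ended up with the former rather than the latter.

Rather than going through the extra steps to get hold of the latter, you can write your own method to get the elements that have a particular class name, for example: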

Public Class Form1

    Dim wb As WebBrowser

    Function GetElementsHavingClassName(doc As HtmlDocument, className As String) As List(Of HtmlElement)
        Dim elems As New List(Of HtmlElement)

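        For Each elem As HtmlElement In doc.All
            ' "className" is the DOM property name for the HTML class attribute.
            Dim classes = elem.GetAttribute("className")
            ' Compare whole space-separated tokens so that "not-a-flintstone" does not match "flintstone".
            If classes IsNot Nothing AndAlso classes.Split(" "c).Any(Function(c) c = className) Then
                elems.Add(elem)
            End If
        Next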

        Return elems

    End Function

    Sub ExtractElements(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        Dim wb = DirectCast(sender, WebBrowser)
        Dim flintstones = GetElementsHavingClassName(wb.Document, "flintstone")

        If flintstones.Count > 0 Then
            For Each fs In flintstones
                ' do something with the element
                TextBox1.AppendText(fs.InnerText & vbCrLf)
            Next
        Else
            TextBox1.Text = "Not found."
        End If

    End Sub

    Sub DoStuff()
        If wb Is Nothing Then
            wb = New WebBrowser
        End If

        RemoveHandler wb.DocumentCompleted, AddressOf ExtractElements ' don't leave any old ones lying around
        AddHandler wb.DocumentCompleted, AddressOf ExtractElements

        Dim loc = "file:///c:\temp\somehtml.html"

        Try
            wb.Navigate(loc)
        Catch ex As Exception
            'TODO: handle the problem gracefully.
            MsgBox(ex.Message)
        End Try

    End Sub

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        DoStuff()

    End Sub

    Private Sub Form1_FormClosing(sender As Object, e As FormClosingEventArgs) Handles MyBase.FormClosing
        If wb IsNot Nothing Then
            RemoveHandler wb.DocumentCompleted, AddressOf ExtractElements
            wb.Dispose()
        End If

    End Sub

End Class

Given this HTML:

<!DOCTYPE html>
<html>
<head><title></title></head>
<body>
<div class="fred flintstone">Fred</div>
<div class="wilma flintstone">Wilma</div>
<div class="not-a-flintstone">Barney</div>
</body>
</html>

the output is:

Fred
Wilma
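
Barney is left out because the class test in GetElementsHavingClassName compares whole space-separated tokens rather than searching for a substring. The console sketch below isolates that check so it can be tried without a WebBrowser. The module name and the HasClassName helper are hypothetical names introduced for illustration and are not part of the answer's code. The sketch also skips empty tokens, which the original Split(" "c) call does not.

```vbnet
Imports System
Imports System.Linq

Module ClassNameMatchDemo

    ' Hypothetical helper (not from the answer): returns True when className
    ' appears as a whole token in a space-separated class attribute value.
    Function HasClassName(classAttribute As String, className As String) As Boolean
        If String.IsNullOrWhiteSpace(classAttribute) Then
            Return False
        End If

        ' Drop empty entries so repeated spaces do not produce blank tokens.
        Dim tokens = classAttribute.Split({" "c}, StringSplitOptions.RemoveEmptyEntries)
        Return tokens.Any(Function(t) t = className)
    End Function

    Sub Main()
        Console.WriteLine(HasClassName("fred flintstone", "flintstone"))   ' True
        Console.WriteLine(HasClassName("not-a-flintstone", "flintstone"))  ' False
        Console.WriteLine(HasClassName("", "flintstone"))                  ' False
    End Sub

End Module
```

A substring test such as classAttribute.Contains(className) would wrongly match "not-a-flintstone", which is why the comparison is done per token.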