如何从EXCEL文件中的源代码获取带有VBA的META关键字内容

时间:2016-06-11 11:25:48

标签: html excel-vba keyword meta-tags vba

我必须将数百个网站的源代码下载到Excel文件(例如,工作表1中的单元格(1,1)),然后在let' s中提取META标记关键字的内容。说细胞(1,2)。

要下载,请在VBA中使用以下代码:

Dim htm As Object
Set htm = CreateObject("HTMLfile")
URL = "https://www.insolvenzbekanntmachungen.de/cgi-bin/bl_aufruf.pl?PHPSESSID=8ecbeb942c887974468b9010531fc7ab&datei=gerichte/nw/agkoeln/16/0071_IN00181_16/2016_06_10__11_53_26_Anordnung_Sicherungsmassnahmen.htm"
With CreateObject("MSXML2.XMLHTTP")
    .Open "GET", URL, False
    .send
    htm.body.innerHTML = .responseText
    Cells(1, 1) = .responseText
End With

我在这个网站上发现了以下代码,但不幸的是,我无法对其进行调整以解决我的问题:

Sub GetData()
Dim ie As New InternetExplorer
Dim str As String
Dim wk As Worksheet
Dim webpage As New HTMLDocument
Dim item As HTMLHtmlElement

Set wk = Worksheets(1)
str = "https://www.insolvenzbekanntmachungen.de/cgi-bin/bl_aufruf.pl?PHPSESSID=8ecbeb942c887974468b9010531fc7ab&datei=gerichte/nw/agkoeln/16/0071_IN00181_16/2016_06_10__11_53_26_Anordnung_Sicherungsmassnahmen.htm"
ie.Visible = True

ie.navigate str

Do
    DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE


'Find the proper meta element --------------
Const META_TAG As String = "META"
Const META_NAME As String = "keywords"
Dim Doc As HTMLDocument
Dim metaElements As Object
Dim element As Object
Dim kwd As String


Set Doc = ie.Document
Set metaElements = Doc.all.tags(META_TAG)

For Each element In metaElements
    If element.Name = META_NAME Then
        kwd = element.Content
    End If
Next

MsgBox kwd

End Sub

我想我必须修改这一行,但不知道如何:

Set Doc = ie.Document

你能帮帮我吗?

1 个答案:

答案 0 :(得分:0)

将WebrowserControl嵌入Excel电子表格或用户表格中 How to add a Webrowser to Excel
设置对HTML库的引用
How to add VBA References – Internet Controls, HTML Object Library

从这篇帖子中获取Greg Truby的代码Webbroswer Control

您可以访问文档对象模型(DOM)。这将暴露大多数HTMLElements属性和事件

Option Explicit

Private WithEvents htmDocument As HTMLDocument
Private WithEvents MyButton As HTMLButtonElement

Private Function MyButton_onclick() As Boolean
    MsgBox "Sombody Click MyButton on WebBrowser1"
End Function

Private Sub WebBrowser1_NavigateComplete2(ByVal pDisp As Object, URL As Variant)
    Dim aTags As Hyperlinks

    Do Until .ReadyState = READYSTATE_COMPLETE
        DoEvents
    Loop

    Set MyButton = htmDocument.getElementById("MyButtonID")
    Set htmDocument = WebBrowser1.Document
    Set aTags = htmDocument.getElementsByTagName("a")
End Sub

Google Web Api,HTA,(MDN){https://developer.mozilla.org/en-US/docs/Web/API}如果您遇到困难,请尝试将Javascript代码重构为vbscript。它' S