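Trying to extract one value from a web page using VBA in Excel

Asked: 2014-05-11 14:53:54

Tags: excel excel-vba web-scraping html vba

I have been trying to find this information for a few days, but all the examples I have found contain only a small part of the code, and I need the whole thing =)

What I want to do is extract one value from a web page and put it into a cell in Excel (then fetch another value from another page on the same site, put it in the next cell, and so on).

The site is a Swedish stock exchange page, and the page I use as a test page is "Investor B" (https://www.avanza.se/aktier/om-aktien.html/5247/investor-b).

The value I am interested in is the one labeled "Senast" (this is the page markup surrounding it):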

<li>
    <span class="XSText">Senast<br/></span>
    <span class="lastPrice SText bold"><span class="pushBox roundCorners3"    title="Senast uppdaterad: 17:29:59">248,60</span></span>
</li>

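The value I am after is 248,60!

I have some coding experience, but not with VBA scripting. After reading some forum posts (mostly here), I have been trying out a few examples of my own, but I cannot get anything to work. My VBA is very basic, so my structure may be wrong; please bear with me. This is my test, but I get "Run-time error 429: ActiveX component can't create object".

I may be on completely the wrong track.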

Private Sub CommandButton1_Click()
Dim ie As Variant
Set ie = CreateObject("InternetExplorer")
ie.navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"
ie.Visible = True
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Application.Wait (Now() + TimeValue("00:00:016")) ' For internal page refresh or loading
Dim doc As Variant 'variable for document or data which need to be extracted out of webpage
Set doc = CreateObject("HTMLDocument")
Set doc = ie.document
Dim dd As Variant
dd = doc.getElementsByClassName("lastPrice SText bold")(0).innerText
MsgBox dd
End Sub

EDIT 2014-05-12 17:05: the code I am currently testing

Under the button command:

Private Sub CommandButton1_Click()
Dim IE As Object
' Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")

' Keep IE hidden while loading (set to True to watch the page load)
IE.Visible = False

' Navigate to the page
IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"

' Statusbar
Application.StatusBar = "Loading, Please wait..."

' Wait while IE loading...
'Do While IE.Busy
'    Application.Wait DateAdd("s", 1, Now)
'Loop
'this should go from ready-busy-ready
IEWait IE

Application.StatusBar = "Searching for value. Please wait..."
' Dim Document As HTMLDocument
' Set Document = IE.Document
Dim dd As Variant
dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText

MsgBox dd

' Show IE
IE.Visible = True

' Clean up
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing

Application.StatusBar = ""


End Sub

In Module1:

Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Public Function IEWait(p_ieExp As InternetExplorer)

'this should go from ready-busy-ready
Dim initialReadyState As Integer
initialReadyState = p_ieExp.ReadyState

'wait 250 ms until it's done
Do While p_ieExp.Busy Or p_ieExp.ReadyState <> READYSTATE_COMPLETE
    Sleep 250
Loop

End Function

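As I said, I don't know whether I have the structure right with this latest addition; I'm afraid I'm not very experienced with this kind of coding.

Best regards

End of edit 2014-05-12 17:08

2 Answers:

Answer 0 (score: 6)

You are close, but there are a few small mistakes.

Here is how I would set it up (tested):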

Private Sub CommandButton1_Click()
    Dim IE As Object

    ' Create InternetExplorer Object
    Set IE = CreateObject("InternetExplorer.Application")

    ' Keep IE hidden while loading (set to True to watch the page load)
    IE.Visible = False

    ' URL to get data from
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"

    ' Statusbar
    Application.StatusBar = "Loading, Please wait..."

    ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop

    Application.StatusBar = "Searching for value. Please wait..."

    Dim dd As String
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText

    MsgBox dd

    ' Show IE
    IE.Visible = True

    ' Clean up
    Set IE = Nothing

    Application.StatusBar = ""
End Sub

Result:

(screenshot not preserved)


Tested in Excel 2010 with the following references:

(screenshot not preserved)

Edit - Option B

To get rid of a possible "Run-time error '91'", try changing a few lines like this:

Dim dd As Variant
Set dd = IE.Document.getElementsByClassName("lastPrice SText bold")

MsgBox dd(0).textContent
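
Run-time error '91' (object variable not set) can appear here when the lookup runs before the page has finished loading, so that no matching element exists yet. A minimal sketch of a more defensive variant follows, assuming late binding as in the code above. The procedure name ShowLastPrice and the 30-second timeout are illustrative choices, not part of the original answer. READYSTATE_COMPLETE is only predefined when the Microsoft Internet Controls reference is set, so the sketch declares it locally with its documented value of 4.

```vba
Option Explicit

' Hypothetical example procedure: waits for the page, then checks that
' the element exists before reading it.
Public Sub ShowLastPrice()
    Const READYSTATE_COMPLETE As Long = 4     ' not predefined with late binding
    Const TIMEOUT_SECONDS As Double = 30      ' illustrative value

    Dim IE As Object
    Dim matches As Object
    Dim startTime As Double

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/5247/investor-b"

    ' Wait until IE is neither busy nor still loading, or give up after the timeout.
    ' Timer resets at midnight; this sketch ignores that edge case.
    startTime = Timer
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
        DoEvents
        If Timer - startTime > TIMEOUT_SECONDS Then Exit Do
    Loop

    If IE.Busy Or IE.readyState <> READYSTATE_COMPLETE Then
        MsgBox "Timed out while loading the page."
        IE.Quit
        Exit Sub
    End If

    ' Only read the element if the collection actually contains one
    Set matches = IE.Document.getElementsByClassName("lastPrice SText bold")
    If matches.Length > 0 Then
        MsgBox matches.Item(0).innerText
    Else
        MsgBox "Element not found (page not fully loaded, or its markup changed)."
    End If

    IE.Quit
    Set IE = Nothing
End Sub
```

Checking both Busy and readyState is the same idea as the IEWait helper in the question; the length check turns a missing element into a message instead of a run-time error.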

Edit - Option C

Another way to get the element:

Dim tag
Dim tags As Object
Set tags = IE.Document.getElementsByTagName("*")

For Each tag In tags
    If tag.className = "lastPrice SText bold" Then
        MsgBox tag.innerText
        Exit For
    End If
Next tag

(All three methods were tested in Excel 2010 with IE10.)
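
All three options return the price as the text shown on the page, for example 248,60 with a decimal comma. If a numeric value is needed regardless of Excel's regional settings, something like the following could be used. This is only a sketch: the function name ParseDecimalComma is hypothetical, and it assumes the text uses a comma as decimal separator and at most spaces (ordinary or non-breaking) as thousands separators.

```vba
Option Explicit

' Hypothetical helper: converts text such as "248,60" or "1 248,60" to a Double.
' Val always treats "." as the decimal separator, independent of locale.
Public Function ParseDecimalComma(ByVal rawText As String) As Double
    Dim cleaned As String

    cleaned = Replace(rawText, ChrW(160), "")   ' drop non-breaking spaces
    cleaned = Replace(cleaned, " ", "")         ' drop ordinary spaces
    cleaned = Replace(cleaned, ",", ".")        ' decimal comma -> decimal point

    ParseDecimalComma = Val(cleaned)
End Function
```

It could then be called as ParseDecimalComma(dd) before showing the value or writing it to a cell.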

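Answer 1 (score: 0)

I just wanted to add the code I am currently running, in case other people run into the same problem; it works very well at the moment. It puts two values into dedicated cells.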

Private Sub CommandButton10_Click()
    Dim IE As Object
    Dim dd As Variant
    ' Create InternetExplorer Object
    Set IE = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
    IE.Visible = False

    ' Navigate to the first page
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/52476/alk-abell-b"

    Application.StatusBar = "Loading, Please wait..."
    IEWait IE

    Application.StatusBar = "Searching for value. Please wait..."
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText

    Range("Y2").Value = dd
    
    IE.Navigate "https://www.avanza.se/aktier/om-aktien.html/52380/alm--brand"

    Application.StatusBar = "Loading, Please wait..."
    IEWait IE

    Application.StatusBar = "Searching for value. Please wait..."
    dd = IE.Document.getElementsByClassName("lastPrice SText bold")(0).innerText

    Range("Y3").Value = dd

    ' Clean up
    Set IE = Nothing

    Application.StatusBar = ""
End Sub

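If you want more data, just copy the part that starts with IE.Navigate "https://www.pagewhereyourdatayouwanttoextractis.com" and ends with Range("Y2").Value = dd.

This assumes that the page you want to extract data from has a structure similar to the one above.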
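
The copy-and-paste approach could also be written as a loop over a list of URLs. The following is a rough sketch under a few assumptions: it uses CreateObject("InternetExplorer.Application") as in the other answer rather than the class ID above, it inlines the wait loop instead of calling IEWait, and the procedure name FetchLastPrices is made up for illustration. The two URLs and the target column Y are taken from the code above.

```vba
Option Explicit

' Hypothetical example: fetches the same element from several pages
' and writes the results to Y2, Y3, ... on the active sheet.
Public Sub FetchLastPrices()
    Const READYSTATE_COMPLETE As Long = 4   ' not predefined with late binding

    Dim urls As Variant
    Dim IE As Object
    Dim matches As Object
    Dim target As Range
    Dim i As Long

    urls = Array( _
        "https://www.avanza.se/aktier/om-aktien.html/52476/alk-abell-b", _
        "https://www.avanza.se/aktier/om-aktien.html/52380/alm--brand")

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = False

    For i = LBound(urls) To UBound(urls)
        Application.StatusBar = "Loading page " & (i - LBound(urls) + 1) & "..."
        IE.Navigate urls(i)

        ' Wait until the page has finished loading
        Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop

        ' One row per URL, starting at Y2
        Set target = ActiveSheet.Range("Y2").Offset(i - LBound(urls), 0)

        Set matches = IE.Document.getElementsByClassName("lastPrice SText bold")
        If matches.Length > 0 Then
            target.Value = matches.Item(0).innerText
        Else
            target.Value = "not found"
        End If
    Next i

    ' Clean up
    IE.Quit
    Set IE = Nothing
    Application.StatusBar = False
End Sub
```

Adding another page then only requires one more entry in the urls array.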

Hope this helps someone.

Best regards