Excel VBA: scraping li tags that share the same class name

Date: 2017-12-03 17:21:25

Tags: html excel vba excel-vba web-scraping

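I am trying to scrape li tags that share the same class name. The items sit inside a ul element with the class top-section-list. My VBA code is: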



    Sub GetData()

        Dim objIE As InternetExplorer
        Dim itemEle As Object
        Dim data As String
        Dim y As Integer

        Set objIE = New InternetExplorer
        objIE.Visible = True

        objIE.navigate "https://www.bhphotovideo.com/c/product/1312545-REG/fujifilm_16550643_instax_mini_9_instant.html"
        Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

        For Each itemEle In objIE.document.getElementsByClassName("top-section-list")
            data = itemEle.getElementsByTagName("li")(0).innerText
        Next

        Range("A1").Value = data
    End Sub

The VBA code above only scrapes the first listed item, not the rest.


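It only writes "sample text# 1" to cell A1. How can I write the text of all the li tags into cell A1? The desired result in cell A1 is: sample text# 1 sample text# 2 sample text# 3 sample text# 4 sample text# 5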

Thanks!

3 Answers:

Answer 0: (score: 1)

The simplest way is probably:

Dim data As String
Dim elem As Object

data = ""
For Each elem In html.getElementsByClassName("top-section-list")(0).getElementsByTagName("li")
    data = data & " " & elem.innerText
Next elem
[A1] = data

Output:

 sample text# 1 sample text# 2 sample text# 3 sample text# 4 sample text# 5 

Now, give this a try:

Sub GetData()
    Dim IE As New InternetExplorer, html As HTMLDocument
    Dim elem As Object, data As String

    With IE
        .Visible = True
        .navigate "https://www.bhphotovideo.com/c/product/1312545-REG/fujifilm_16550643_instax_mini_9_instant.html"
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        Set html = .document
    End With

    data = ""

    For Each elem In html.getElementsByClassName("top-section-list")(0).getElementsByTagName("li")
        data = data & " " & elem.innerText
    Next elem

    Range("A1").Value = data

    IE.Quit
End Sub

I tried to tidy up your code to make it a little nicer to read.

References to add to the project:

Microsoft Internet Controls
Microsoft HTML Object Library

One more thing: as Jeeped pointed out, if you end up with leading or trailing spaces, you can try something like Range("A1").Value = Trim(Application.WorksheetFunction.Clean(data))
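The stray leading space comes from prepending the separator on every pass of the loop. One way to avoid it at the source, rather than trimming afterwards, is sketched below. AppendWithSeparator is a hypothetical helper name introduced here for illustration; it is not part of the answer above, and it only assumes that each item's text arrives as a plain String.

```vba
' Hypothetical helper (not from the original answer): appends itemText to
' soFar, inserting sep only between two non-empty pieces.
Function AppendWithSeparator(ByVal soFar As String, ByVal itemText As String, ByVal sep As String) As String
    ' Trim$ strips leading and trailing spaces only (not tabs or line breaks).
    itemText = Trim$(itemText)

    If Len(itemText) = 0 Then
        AppendWithSeparator = soFar                  ' nothing to add
    ElseIf Len(soFar) = 0 Then
        AppendWithSeparator = itemText               ' first piece, no separator
    Else
        AppendWithSeparator = soFar & sep & itemText
    End If
End Function
```

In the loop above, the concatenation line would then read data = AppendWithSeparator(data, elem.innerText, " "), which should leave data without a leading or trailing space as long as the item texts contain no other whitespace characters.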

Answer 1: (score: 1)

Use .querySelectorAll and loop over the returned nodeList.

The CSS selector to use is

ul.top-section-list li

VBA code:

Option Explicit
Public Sub GetData()
    Dim objIE As InternetExplorer, nodeList As Object, currentItem As Long, outputString As String
    Set objIE = New InternetExplorer
    objIE.Visible = True
    objIE.navigate "https://www.bhphotovideo.com/c/product/1312545-REG/fujifilm_16550643_instax_mini_9_instant.html"

    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

    Set nodeList = objIE.document.querySelectorAll("ul.top-section-list li")
    With ActiveSheet                             '<== use actual sheet name
        For currentItem = 0 To nodeList.Length - 1
            outputString = outputString & Chr$(32) & nodeList.item(currentItem).innerText
        Next currentItem
        .Cells(1, 1) = Trim$(outputString)
    End With
    'ObjIE.Quit
End Sub


Answer 2: (score: 0)

Concatenate the string values into the data string variable, using vbLf as the delimiter.

Dim data As String, i As Long
data = vbNullString

With objIE.document.getElementsByClassName("top-section-list")(0)
    For i = 0 To .getElementsByTagName("li").Length - 1
        data = data & vbLf & .getElementsByTagName("li")(i).innerText
    Next
End With
Range("A1").Value = Mid(data, 2)  'Mid drops the leading vbLf; write data INTO A1, not the other way around