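I am new to VBA and to working with websites.
I am trying to extract data (a table) from the website below so I can use it in VBA code.
I tried creating an Internet Explorer instance: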
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
With appIE
.Navigate "http://www.bkam.ma/Marches/Principaux-indicateurs/Marche-obligataire/Marche-des-bons-de-tresor/Marche-secondaire/Taux-de-reference-des-bons-du-tresor?date=13%2F02%2F2019&block=e1d6b9bbf87f86f8ba53e8518e882982#address-c3367fcefc5f524397748201aee5dab8-e1d6b9bbf87f86f8ba53e8518e882982"
.Visible = True
End With
Do While appIE.Busy
DoEvents
Loop
Then I tried to get the data by ID or by tag name:
Set val = appIE.document.getElementById()
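I don't know how to get the elements in the table, because they have no ID or tag name that I can use, as you can see in this excerpt from the page source: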
</span>
</div>
</th>
</tr>
</thead>
<tbody>
<tr>
<td>18/03/2019</td>
<td><span class="number">20,05</sapn> <span class="symbol"></span></td>
<td><span class="number">2,250</sapn> <span class="symbol">%</span></td>
<td>13/02/2019</td>
</tr>
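This snippet shows the first row of the table I want to extract.
Answer 0 (score: 1)
You can avoid using a browser: fetch the page content with xmlhttp, then select the table element by its class (there is no id to use, and class is the next fastest selector after id), then loop over the rows and columns and write them out.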
Option Explicit
Public Sub GetTable()
Dim html As MSHTML.HTMLDocument, hTable As Object, ws As Worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set html = New MSHTML.HTMLDocument '< VBE > Tools > References > Microsoft HTML Object Library
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "http://www.bkam.ma/Marches/Principaux-indicateurs/Marche-obligataire/Marche-des-bons-de-tresor/Marche-secondaire/Taux-de-reference-des-bons-du-tresor?date=13%2F02%2F2019&block=e1d6b9bbf87f86f8ba53e8518e882982#address-c3367fcefc5f524397748201aee5dab8-e1d6b9bbf87f86f8ba53e8518e882982", False
.send
html.body.innerHTML = .responseText
End With
Set hTable = html.querySelector(".dynamic_contents_ref_12")
Dim td As Object, tr As Object, th As Object, r As Long, c As Long
For Each tr In hTable.getElementsByTagName("tr")
r = r + 1: c = 1
For Each th In tr.getElementsByTagName("th")
ws.Cells(r, c) = th.innerText
c = c + 1
Next
For Each td In tr.getElementsByTagName("td")
ws.Cells(r, c) = td.innerText
c = c + 1
Next
Next
End Sub
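If the request fails or the site changes its class name, hTable ends up as Nothing and the loop above raises a run-time error. A minimal sketch of a guard follows; FetchFirstMatch is a hypothetical helper name, not part of the original answer, and it assumes the same Microsoft HTML Object Library reference:

```vba
Option Explicit

' Hypothetical helper: returns the first element matching cssSelector in the
' page at url, or Nothing if the request fails or nothing matches.
' Requires VBE > Tools > References > Microsoft HTML Object Library.
Public Function FetchFirstMatch(ByVal url As String, ByVal cssSelector As String) As Object
    Dim html As MSHTML.HTMLDocument
    Set html = New MSHTML.HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False          ' False = synchronous request
        .send
        If .Status <> 200 Then Exit Function   ' result stays Nothing
        html.body.innerHTML = .responseText
    End With
    ' querySelector yields Nothing when no element matches the selector
    Set FetchFirstMatch = html.querySelector(cssSelector)
End Function
```

The caller can then test the result with `If hTable Is Nothing Then Exit Sub` before looping over the rows.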
Answer 1 (score: 0)
Set HTMLTable = appIE.document.getElementsByClassName("dynamic_contents_ref_12")(0)
This gets the collection of HTML elements with the class name dynamic_contents_ref_12 and returns its first element.
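This line only works once the page has finished loading. Checking Busy alone can return too early, so a common pattern is to also wait for readyState to reach 4 (the value of READYSTATE_COMPLETE). A sketch, assuming the appIE object from the question; WaitForPage is a hypothetical name:

```vba
' Hypothetical helper: blocks until the IE instance has finished loading.
Public Sub WaitForPage(ByVal appIE As Object)
    ' 4 = READYSTATE_COMPLETE; the named constant is unavailable with late binding
    Do While appIE.Busy Or appIE.readyState <> 4
        DoEvents   ' keep Excel responsive while waiting
    Loop
End Sub
```

Call `WaitForPage appIE` after `.Navigate` and before touching `appIE.document`.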
This gives you the first row:
Set TBody = HTMLTable.Children(1) 'The <tbody> tag is the second child
Set Row1 = TBody.Children(0) 'The first <tr> inside the <tbody> tag
For each further row, put a different index in the parentheses.
The HTML in Row1 now looks like this:
<tr>
<td>
18/03/2019
</td>
<td>
<span class="number">
20,05
<span class="symbol"></span>
</span>
</td>
<td>
<span class="number">
2,250
<span class="symbol">%</span>
</span>
</td>
<td>
13/02/2019
</td>
</tr>
(Each <td> is one cell in the row.)
To get the text in a cell, we can use the .innerText property, which returns a string:
CellA1 = Row1.Children(0).innerText ' e.g. "18/03/2019" for the row shown above
CellB1 = Row1.Children(1).innerText ' e.g. "20,05" for the row shown above
Using a For Each loop, we can take every cell from the HTML table and copy it to the worksheet, assuming you want to start at cell A1.
'Table Headers
ActiveSheet.Range("A1").Value = "Date d'échéance"
ActiveSheet.Range("B1").Value = "Transaction"
ActiveSheet.Range("C1").Value = "Taux moyen pondéré"
ActiveSheet.Range("D1").Value = "Date de la valeur"
Set HTMLTable = appIE.document.getElementsByClassName("dynamic_contents_ref_12")(0)
Set TBody = HTMLTable.Children(1)
RowIndex = 2
For Each Row In TBody.Children
ActiveSheet.Cells(RowIndex, 1).Value = Row.Children(0).innerText
ActiveSheet.Cells(RowIndex, 2).Value = Row.Children(1).innerText
ActiveSheet.Cells(RowIndex, 3).Value = Row.Children(2).innerText
ActiveSheet.Cells(RowIndex, 4).Value = Row.Children(3).innerText
RowIndex = RowIndex + 1
Next
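The four hard-coded column lines can also be written as an inner loop over the cells of each row, which keeps working if the table gains or loses a column. A sketch under the same assumptions as the code above (a loaded appIE and the class name dynamic_contents_ref_12); CopyTableRows is a hypothetical name:

```vba
' Hypothetical variant: copies every cell of every <tbody> row,
' starting at row 2 so that row 1 stays free for the headers.
Public Sub CopyTableRows(ByVal appIE As Object)
    Dim HTMLTable As Object, TBody As Object
    Dim Row As Object, Cell As Object
    Dim RowIndex As Long, ColIndex As Long

    Set HTMLTable = appIE.document.getElementsByClassName("dynamic_contents_ref_12")(0)
    Set TBody = HTMLTable.Children(1)   ' second child of the table, as above

    RowIndex = 2
    For Each Row In TBody.Children
        ColIndex = 1
        For Each Cell In Row.Children
            ' Trim$ removes the padding whitespace around the cell text
            ActiveSheet.Cells(RowIndex, ColIndex).Value = Trim$(Cell.innerText)
            ColIndex = ColIndex + 1
        Next Cell
        RowIndex = RowIndex + 1
    Next Row
End Sub
```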