Data extraction from a web page to Excel using VBA

Time: 2014-01-29 12:50:20

Tags: vba data-extraction

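I am trying to pull a table from a web page. So far I have managed to pull the table itself, but every row of the table contains links, and the output I get is only the text, without the links. Is there a way to pull a table from a web page with VBA so that the hyperlinks are included?

Here is my code: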

Sub TableExample()
    Dim IE As Object
    Dim doc As Object
    Dim strURL As String

    strURL = "HERE I USED MY URL"
    ' replace with URL of your choice

    Set IE = CreateObject("InternetExplorer.Application")
    With IE
        '.Visible = True

        .Navigate strURL
        Do Until .readyState = 4: DoEvents: Loop
        Do While .Busy: DoEvents: Loop
        Set doc = IE.Document
        GetAllTables doc

        .Quit
    End With
End Sub

Sub GetAllTables(doc As Object)

    ' get all the tables from a webpage document, doc, and put them in a new worksheet

    Dim ws As Worksheet
    Dim rng As Range
    Dim tbl As Object
    Dim rw As Object
    Dim cl As Object
    Dim tabno As Long
    Dim nextrow As Long
    Dim I As Long

    Set ws = Worksheets.Add

    For Each tbl In doc.getElementsByTagName("TABLE")
        tabno = tabno + 1
        nextrow = nextrow + 1
        Set rng = ws.Range("B" & nextrow)
        rng.Offset(, -1) = "Table " & tabno
        For Each rw In tbl.Rows
            For Each cl In rw.Cells
                rng.Value = cl.outerText
                Set rng = rng.Offset(, 1)
                I = I + 1
            Next cl
            nextrow = nextrow + 1
            Set rng = rng.Offset(1, -I)
            I = 0
        Next rw
    Next tbl

    ws.Cells.ClearFormats

End Sub

1 answer:

Answer 0: (score: 1)

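When you execute "rng.Value = cl.outerText", you get only the text. If you need all the links and the rest of the HTML, use the innerHTML property.

Replace "rng.Value = cl.outerText" with "rng.Value = cl.innerHTML". This returns the entire HTML of the cell, links included ;)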
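
Note that innerHTML puts the raw markup (for example the whole <a href="..."> tag) into the worksheet cell as text. If what you want is a clickable link in Excel, something along the following lines might work instead. This is only a sketch and not part of the original answer: WriteCellWithLink is a hypothetical helper name, and it assumes that each table cell holds at most one link worth keeping (only the first <a> element is used).

```vba
' Hypothetical helper, not from the original post.
' Writes the visible text of an HTML table cell (cl) into a worksheet
' cell (rng). If the HTML cell contains at least one <a> element, the
' worksheet cell is also turned into a hyperlink to the first link's href.
Sub WriteCellWithLink(ws As Worksheet, rng As Range, cl As Object)
    Dim links As Object

    ' Plain text first, same result as the original outerText assignment
    rng.Value = cl.innerText

    ' Collect the anchor elements inside this table cell
    Set links = cl.getElementsByTagName("a")

    If links.Length > 0 Then
        ' Assumption: only the first link in the cell matters
        ws.Hyperlinks.Add Anchor:=rng, Address:=links(0).href
    End If
End Sub
```

To try it, the line "rng.Value = cl.outerText" inside GetAllTables would become "WriteCellWithLink ws, rng, cl". The "ws.Cells.ClearFormats" call at the end of GetAllTables will probably strip the blue underlined link style; the hyperlinks themselves should still be there, but you may prefer to drop that line.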