Web scraping in Excel using a variable URL (URL extension)

Time: 2016-06-17 17:20:53

Tags: excel vba excel-vba web-scraping

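I'm fairly new to VBA in Excel, and I've been trying to figure out how to scrape web page data conditionally, based on a cell value ("Guid"). I haven't really found a way to take the functionality further and make it dynamic. As of now, I can only get it to retrieve the data for one specific cell and print it in another specified cell. I believe I'm just missing some kind of loop variable? (Aside from there probably being a more correct way to write the code.)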

Sub ie_open()
  Dim wb As Workbook
  Dim ws As Worksheet
  Dim TxtRng As Range
  Dim Guid As Range
  Dim ie As Object
  Dim URL As String

  URL = "https://url.com/userpage="
  Set wb = ActiveWorkbook
  Set ws = wb.Sheets("Detail Report - Individuals")
  Set Guid = ws.Range("E2")
  Set TxtRng = ws.Range("F2")
  Set ie = CreateObject("INTERNETEXPLORER.APPLICATION")

  ie.NAVIGATE (URL + Guid)
  ie.Visible = True

  While ie.ReadyState <> 4
     DoEvents
  Wend

  TxtRng = ie.document.getelementbyid("lbl_Location").innertext

End Sub

Thanks in advance.

1 answer:

Answer 0 (score: 0)

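Turn on a reference to HTML elements (the Microsoft HTML Object Library, under Tools > References). You should also turn on a reference to Microsoft Internet Controls so that you can declare ie as an InternetExplorer object rather than just an Object, but that isn't necessary. Then you can loop through each element, like this: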

Sub ie_open()
  Dim wb As Workbook
  Dim ws As Worksheet
  Dim TxtRng As Range
  Dim Guid As Range
  Dim ie As Object
  Dim URL As String
  'ADDED THIS
  Dim sl As IHTMLElement
  Dim r As Long

  'VBA cannot initialize a variable inside Dim, so r is assigned separately
  r = 1
  URL = "https://url.com/userpage="
  Set wb = ActiveWorkbook
  Set ws = wb.Sheets("Detail Report - Individuals")
  Set Guid = ws.Range("E2")
  Set TxtRng = ws.Range("F2")
  Set ie = CreateObject("INTERNETEXPLORER.APPLICATION")

  ie.NAVIGATE (URL + Guid)
  ie.Visible = True

  While ie.ReadyState <> 4
     DoEvents
  Wend

  'Write the text of every element on the page into column A, one row each
  For Each sl In ie.document.all
    ws.Cells(r, 1).Value = sl.innerText
    r = r + 1
  Next

  'TxtRng = ie.document.getelementbyid("lbl_Location").innertext
End Sub

Edit: I forgot to increment the r variable in the loop, and I think the loop should be initialized with IE.Document.All.
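
To make the scrape dynamic, as the question asks, the same steps could be wrapped in a loop over the GUID column. The following is a minimal sketch under stated assumptions, not a tested solution. It assumes the GUIDs sit in column E starting at row 2, that the result belongs in column F of the same row, and that each page exposes the lbl_Location element. The names ScrapeLocations, BASE_URL, lastRow, and el, and the "not found" placeholder text, are illustrative and do not come from the original post.

```vba
Sub ScrapeLocations()
    ' Sketch only: visits one page per GUID in column E and writes the
    ' lbl_Location text into column F of the same row.
    Const BASE_URL As String = "https://url.com/userpage="
    Dim ws As Worksheet
    Dim ie As Object
    Dim el As Object
    Dim lastRow As Long
    Dim r As Long
    Dim guid As String

    Set ws = ActiveWorkbook.Sheets("Detail Report - Individuals")

    ' Assumption: column E holds the GUIDs, with a header in row 1.
    lastRow = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True

    For r = 2 To lastRow
        guid = Trim$(CStr(ws.Cells(r, "E").Value))
        If Len(guid) > 0 Then
            ie.Navigate BASE_URL & guid

            ' Wait until the page has finished loading (READYSTATE_COMPLETE = 4).
            Do While ie.Busy Or ie.ReadyState <> 4
                DoEvents
            Loop

            Set el = ie.document.getElementById("lbl_Location")
            If el Is Nothing Then
                ' Hypothetical placeholder for pages without the element.
                ws.Cells(r, "F").Value = "not found"
            Else
                ws.Cells(r, "F").Value = el.innerText
            End If
        End If
    Next r

    ie.Quit
    Set ie = Nothing
End Sub
```

Reusing a single Internet Explorer instance for all rows avoids opening one browser window per GUID, and late binding (As Object) means the sketch should not need the extra references mentioned in the answer.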