Fetching data from multiple URLs

Time: 2014-04-10 18:26:39

Tags: vba web-scraping

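I want to fetch data from the following page:

view-source:http://www.difc.ae/debauve-gallais-chocolates-llc

It is an organization listed at the DIFC (Dubai International Financial Centre). If you open the source of the page above, the relevant HTML tags and their respective values are on lines 372-384.

Fetching the data for a single page is easy, but I have 340 organizations (web URLs) on the same site to process. I have arranged the URLs in a worksheet, and I am trying to get the following code to work: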

Sub GetData() 
    Dim oHtm As Object: Set oHtm = CreateObject("HTMLFile") 
    Dim req As Object: Set req = CreateObject("msxml2.xmlhttp") 
    Dim oRow As Object 
    Dim oCell As Range 
    Dim url As String 
    Dim y As Long, x As Long 

    x = 1 
    For Each oCell In Sheets("sheet1").Range("A2:A340") 
        req.Open "GET", oCell.Offset(, 1).Value, False 
        req.send 
        With oHtm 
            .body.innerhtml = req.responsetext 
            With .getelementsbytagname("table")(1) 
                With Sheets(1) 
                    .Cells(x, 1).Value = oCell.Offset(, -1).Value 
                    .Cells(x, 2).Value = oCell.Value 
                End With 
                y = 3 
                For Each oRow In .Rows 
                    Sheets(1).Cells(x, y).Value = oRow.Cells(1).innertext 
                    y = y + 1 
                Next oRow 
            End With 
        End With 
        x = x + 1 
    Next oCell 


End Sub

This does not work. I suspect the reason is the following DOCTYPE declaration in the page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
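
Apart from the DOCTYPE guess, one detail of the loop above may be worth checking: the loop walks column A, so `oCell.Offset(, -1)` refers to a column to the left of column A, which Excel rejects with a run-time error. The request URL is also read from column B (`Offset(, 1)`) rather than from the looped cell. The following is only a minimal sketch of the same loop under stated assumptions: the URLs are in `A2:A340` of `Sheet1`, the wanted values sit in the second cell of each row of the page's second table (the same indices as the original code), and results are written to the right of each URL. The name `GetDataSketch` and the status messages are illustrative, not from the original.

```vba
Option Explicit

' Sketch only. Assumptions: URLs are in Sheet1!A2:A340, and the wanted
' values are in the second cell of each row of the page's second table
' (the same zero-based indices the original code uses).
Sub GetDataSketch()
    Dim doc As Object: Set doc = CreateObject("HTMLFile")
    Dim req As Object: Set req = CreateObject("MSXML2.XMLHTTP")
    Dim ws As Worksheet: Set ws = ThisWorkbook.Worksheets("Sheet1")
    Dim urlCell As Range
    Dim tables As Object, tbl As Object, tblRow As Object
    Dim col As Long

    For Each urlCell In ws.Range("A2:A340")
        If Len(Trim$(urlCell.Value)) > 0 Then
            ' Synchronous GET for the URL stored in the looped cell itself
            req.Open "GET", urlCell.Value, False
            req.send

            If req.Status = 200 Then
                doc.body.innerHTML = req.responseText
                Set tables = doc.getElementsByTagName("table")

                If tables.Length > 1 Then
                    Set tbl = tables(1)          ' second table on the page
                    col = 1                      ' first output column is B
                    For Each tblRow In tbl.Rows
                        ' Skip rows that have no second cell
                        If tblRow.Cells.Length > 1 Then
                            urlCell.Offset(0, col).Value = tblRow.Cells(1).innerText
                            col = col + 1
                        End If
                    Next tblRow
                Else
                    urlCell.Offset(0, 1).Value = "table not found"
                End If
            Else
                ' Record the HTTP status instead of failing silently
                urlCell.Offset(0, 1).Value = "HTTP " & req.Status
            End If
        End If
    Next urlCell
End Sub
```

Writing each result on the same row as its URL keeps the input column intact, and the length checks make it visible which pages do not have the expected table layout. Whether the second table is the right one for every organization page is an assumption that would need checking against the actual source.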

0 answers:

No answers yet