我使用的网站是www.msci.com。该网站使用了几种可以改变的形式。每种选择组合都会创建一个特定的值矩阵(所需的数据)。我想在我的Excel工作表中创建相同的变量,这样我只需在我的工作表中填写一些代码,在B列中创建一个数据表,该数据表应放在同一工作表的其他地方。
我想我必须在我的宏中加入一些HTML代码来填写这些向下滚动菜单(表单)。我发现了每个基于html ID形式的选择,并尝试将这些选择合并到我的代码中。我认为代码部分工作,但是从日历模板更改日期肯定不起作用。到目前为止我的代码:
Sub getMSCIdata()
Dim mktval As String
Dim curr As String
Dim indlvl As String
Dim calendarinput As String
curr = Range("$B$3")
mktval = Range("$B$2")
indlvl = Range("$B$4")
calendarinput = Range("$B$5")
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.mscibarra.com/webapp/indexperf/pages/IEIPerformanceRegional.jsf?scope=0&mktval&size=36&style=C&calendarinput&curr&indlvl&lang=en" _
, Destination:=Range("$A$10"))
.Name = _
"IEIPerformanceRegional.jsf?scope=0&mktval&size=36&style=C&calendarinput&curr&indlvl&lang=en"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = """templateForm:tableResult0"""
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = True
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
答案 0 :(得分:1)
如果网页太复杂而无法使QueryTable提取数据,您可以编写自己的VBA直接从HTML文档中提取数据。
使用工具/参考来添加引用" Microsoft HTML对象库"。
创建一个包含名为WB的大型WebBrowserControl的UserForm。
将此代码添加到表单中:
Private Sub UserForm_Initialize()
WB.navigate ("http://www.mscibarra.com/webapp/indexperf/pages/IEIPerformanceRegional.jsf?scope=0&mktval&size=36&style=C&calendarinput&curr&indlvl&lang=en")
End Sub
Private Sub WB_DocumentComplete(ByVal pDisp As Object, URL As Variant)
Dim tBody As HTMLBody, row As HTMLTableRow
Set tBody = WB.document.getElementById("templateForm:tableResult0:tbody_element")
If tBody Is Nothing Then Stop
For Each row In tBody.rows
Debug.Print CellText(row, 1), CellText(row, 2), CellText(row, 0)
Next
End Sub
' returns an empty string instead of an error
Private Function CellText(row As HTMLTableRow, ByVal cellIndex As Long) As String
Dim Cell As HTMLTableCell
On Error Resume Next
Set Cell = row.Cells.Item(cellIndex)
CellText = Trim(Cell.innerText)
End Function
显示表单。网页应该在几秒钟内加载。 DocumentComplete事件将运行代码以提取和打印Index Code,Last和MSCI Index列。调试窗口应显示:
990300 1,811.383 EAFE
991100 1,785.575 EAFE + CANADA
144097 1,372.105 EAFE ex ISRAEL
991600 2,034.280 EAFE ex UK
991300 1,487.429 EASEA INDEX (EAFE ex JAPAN)
106400 182.491 EMU
106507 169.293 EMU ex GERMANY
990600 399.741 EU
106569 1,076.915 EURO
990500 1,641.595 EUROPE
144115 1,422.575 EUROPE & MIDDLE EAST
106331 189.663 EUROPE ex EMU
995200 1,445.779 EUROPE ex SWITZERLAND
991700 1,854.892 EUROPE ex UK
990900 2,915.545 FAR EAST
113647 1,529.146 G7 INDEX
991200 1,740.757 KOKUSAI INDEX (WORLD ex JP)
990700 6,054.493 NORDIC COUNTRIES
990200 2,113.327 NORTH AMERICA
990800 2,351.421 PACIFIC
991400 1,288.304 PACIFIC ex JAPAN
106570 1,163.646 PAN-EURO
990100 1,721.971 WORLD
701609 1,859.470 WORLD WITH USA GROSS
996200 1,744.360 WORLD ex AUSTRALIA
701610 1,844.715 WORLD ex AUSTRALIA WITH USA GROSS
106330 213.390 WORLD ex EMU
106332 1,745.644 WORLD ex EUROPE
144079 1,637.763 WORLD ex ISRAEL
991500 1,754.637 WORLD ex UK
991000 1,820.809 WORLD ex USA
现在可以直接将这些值放入工作表中。
此技术可以扩展到受HTTP Auth保护的网站和需要登录设置cookie的网站。
您并不仅限于从网页中提取数据。您可以使用VBA填写表单元素,然后单击“提交”按钮。
向表单添加cmdNextPage命令按钮按钮,并添加以下代码:
Private Sub cmdNextPage_Click()
Dim theForm As HTMLFormElement, el As HTMLObjectElement
Set theForm = WB.document.forms("templateForm")
With theForm.elements
.Item("templateForm:_id78").value = "2115" ' set [Market] to "Frontier Markets (FM)"
.Item("templateForm:_id88").value = "Dec 1, 2014" ' set [As of]
End With
theForm.submit
End Sub
测试时,我觉得它不起作用。表单元素会更新,但提交不会执行任何操作。该网页上还有其他内容我无法追踪。您无法使用DocumentComplete来检测页面何时更新,因为它使用AJAX来更新结果表。如果您使用Fiddler来查看线路上的内容,您可以在代码中复制AJAX请求。对不起,我没有时间对此进行进一步的排查。