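Extracting data from xmlhttp.responseText into Excel

Time: 2018-09-28 05:28:15

Tags: excel vba excel-vba web-scraping xmlhttprequest

I am writing a macro to fetch some data from a website, and I am stuck on the next step: I want to parse the data and print it in Excel. Below are the data structure and the code I am using to print the data. I have been trying various approaches for three days. No luck!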

"data": [{
 "annualisedVolatility": "37.35",
 "bestBuy": "6.04",.
  etc                               
 }]

VBA code: I have hardcoded the data in the URL, so you can drop the variables when testing.

Sub getNSEFutData()
    ExpiryDate = Range("TSys_ExpiryDate")
    xRow = 4
    Do Until wksNSE50.Cells(xRow, 1) = ""
        scripID = wksNSE50.Cells(xRow, 1).Value
        scripID = Replace(scripID, "&", "%26")
        'URL = "http://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuoteFO.jsp?underlying=" & scripID & "&instrument=FUTSTK&expiry=" & ExpiryDate
        URL = "http://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuoteFO.jsp?underlying=ADANIPORTS&instrument=FUTSTK&expiry=25OCT2018"
        Set xmlhttp = CreateObject("MSXML2.ServerXMLHTTP.6.0")
        xmlhttp.Open "GET", URL, False
        xmlhttp.setRequestHeader "Content-Type", "text/JSON"
        xmlhttp.Send
        sNo = InStr(1, xmlhttp.responseText, "id=" & Chr(34) & "responseDiv")
        Debug.Print xmlhttp.responseText
        Dim jsonStr As String
        jsonStr = Trim(Mid(xmlhttp.responseText, sNo, InStr(sNo, xmlhttp.responseText, "/div>") - sNo))
        Debug.Print jsonStr

        xRow = xRow + 1
    Loop

End Sub

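2 answers:

Answer 0 (score: 3)

This uses JsonConverter.bas and the whole JSON string (not just the "data" part). You only need to adjust your current InStr approach to determine the start and end points of the JSON extraction. Here I am reading the JSON from a cell. "data" is a collection. Its first item is a dictionary that contains the values you are looking for. You simply loop over the dictionary's keys as shown below. It is easy enough to write them out to a row, for example by using Cells with a cell reference (with rowCounter and columnCounter variables giving the position). You would increment rowCounter in an outer loop so that each pass writes to a new row.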

VBA:

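' Requires the VBA-JSON module (JsonConverter.bas) imported into the project,
' plus a reference to Microsoft Scripting Runtime.
Public Sub GetInfoFromSheet()
    Dim jsonStr As String, json As Object, key As Variant, columnCounter As Long, rowCounter As Long
    jsonStr = ThisWorkbook.Worksheets("Sheet1").[a1]        'JSON string stored in cell A1
    Set json = JsonConverter.ParseJson(jsonStr)("data")(1)  'first item of the "data" collection (a dictionary)
    rowCounter = rowCounter + 1
    For Each key In json                                    'loop over the dictionary keys
        columnCounter = columnCounter + 1
        ThisWorkbook.Worksheets("Sheet2").Cells(rowCounter, columnCounter) = key & " : " & json(key)
    Next
End Sub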
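
The outer-loop idea described above could be sketched roughly as follows. This is only an illustration under assumptions: WriteQuoteRow is a hypothetical helper name, it relies on the same JsonConverter module, and it expects jsonStr to already be valid JSON with a "data" array whose first item is a dictionary.

```vba
' Sketch only (hypothetical helper): writes the first "data" item to one row.
' Assumes JsonConverter.bas (VBA-JSON) is imported into the project.
Public Sub WriteQuoteRow(ByVal jsonStr As String, ByVal ws As Worksheet, ByVal rowCounter As Long)
    Dim quote As Object, key As Variant, columnCounter As Long

    ' "data" is a collection; item 1 is the dictionary holding the values
    Set quote = JsonConverter.ParseJson(jsonStr)("data")(1)

    For Each key In quote
        columnCounter = columnCounter + 1
        ws.Cells(1, columnCounter).Value = key                  ' header in row 1 (rewritten on each call)
        ws.Cells(rowCounter, columnCounter).Value = quote(key)  ' value in the caller's row
    Next key
End Sub
```

Called once per request from the outer loop with a rowCounter of 2 or more that the caller increments each pass, every quote would land on its own row beneath the headers.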

JSON tree view:

Here is a view of the path being traversed:


JSON string:

This is the JSON I am using:

{
  "valid": "true",
  "tradedDate": "28SEP2018",
  "eqLink": "/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=ADANIPORTS",
  "data": [
    {
      "annualisedVolatility": "37.35",
      "bestBuy": "8.22",
      "totalSellQuantity": "6,65,000",
      "vwap": "338.01",
      "clientWisePositionLimits": "7807220",
      "optionType": "-",
      "highPrice": "342.35",
      "dailyVolatility": "1.95",
      "bestSell": "9.22",
      "marketLot": "2500",
      "sellQuantity5": "7,500",
      "marketWidePositionLimits": "156144401",
      "sellQuantity4": "2,500",
      "sellQuantity3": "2,500",
      "sellQuantity2": "2,500",
      "underlying": "ADANIPORTS",
      "sellQuantity1": "5,000",
      "pChange": "0.55",
      "premiumTurnover": "-",
      "totalBuyQuantity": "4,95,000",
      "turnoverinRsLakhs": "15,227.35",
      "changeinOpenInterest": "3,35,000",
      "strikePrice": "-",
      "openInterest": "98,67,500",
      "buyPrice2": "338.10",
      "buyPrice1": "338.25",
      "openPrice": "339.80",
      "prevClose": "336.55",
      "expiryDate": "25OCT2018",
      "lowPrice": "333.75",
      "buyPrice4": "338.00",
      "buyPrice3": "338.05",
      "buyPrice5": "337.95",
      "numberOfContractsTraded": "1,802",
      "instrumentType": "FUTSTK",
      "sellPrice1": "338.50",
      "sellPrice2": "338.55",
      "sellPrice3": "338.70",
      "sellPrice4": "338.75",
      "sellPrice5": "338.80",
      "change": "1.85",
      "pchangeinOpenInterest": "3.51",
      "ltp": "8.82",
      "impliedVolatility": "-",
      "underlyingValue": "336.20",
      "buyQuantity4": "7,500",
      "buyQuantity3": "2,500",
      "buyQuantity2": "5,000",
      "buyQuantity1": "2,500",
      "buyQuantity5": "5,000",
      "settlementPrice": "336.55",
      "closePrice": "0.00",
      "lastPrice": "338.40"
    }
  ],
  "companyName": "Adani Ports and Special Economic Zone Limited",
  "lastUpdateTime": "28-SEP-2018 11:41:23",
  "isinCode": null,
  "ocLink": "/marketinfo/sym_map/symbolMapping.jsp?symbol=ADANIPORTS&instrument=-&date=-&segmentLink=17&symbolCount=2"
}

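Note:

If you only want to extract the "data" part rather than the longer string, consider adjusting your current InStr approach so that it retrieves the following valid JSON string: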

[{"annualisedVolatility":"37.35","bestBuy":"6.01","totalSellQuantity":"6,30,000","vwap":"337.78","clientWisePositionLimits":"7807220","optionType":"-","highPrice":"342.35","dailyVolatility":"1.95","bestSell":"7.80","marketLot":"2500","sellQuantity5":"2,500","marketWidePositionLimits":"156144401","sellQuantity4":"7,500","sellQuantity3":"2,500","sellQuantity2":"2,500","underlying":"ADANIPORTS","sellQuantity1":"2,500","pChange":"0.65","premiumTurnover":"-","totalBuyQuantity":"6,20,000","turnoverinRsLakhs":"24,370.83","changeinOpenInterest":"5,15,000","strikePrice":"-","openInterest":"1,00,47,500","buyPrice2":"338.30","buyPrice1":"338.35","openPrice":"339.80","prevClose":"336.55","expiryDate":"25OCT2018","lowPrice":"333.75","buyPrice4":"338.10","buyPrice3":"338.20","buyPrice5":"338.05","numberOfContractsTraded":"2,886","instrumentType":"FUTSTK","sellPrice1":"338.80","sellPrice2":"338.90","sellPrice3":"338.95","sellPrice4":"339.00","sellPrice5":"339.05","change":"2.20","pchangeinOpenInterest":"5.40","ltp":"7.60","impliedVolatility":"-","underlyingValue":"336.85","buyQuantity4":"7,500","buyQuantity3":"2,500","buyQuantity2":"5,000","buyQuantity1":"2,500","buyQuantity5":"5,000","settlementPrice":"336.55","closePrice":"0.00","lastPrice":"338.75"}]
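
One possible way to make that InStr adjustment is sketched below. ExtractDataArray is a hypothetical helper name, and the sketch assumes the text contains a "data" key followed by an array with no nested arrays and no "]" characters inside the values, which holds for the sample shown here but is not guaranteed in general.

```vba
' Sketch only (hypothetical helper): returns the [...] array that follows the "data" key,
' or an empty string when it cannot be found.
Public Function ExtractDataArray(ByVal jsonText As String) As String
    Dim keyPos As Long, startPos As Long, endPos As Long

    keyPos = InStr(1, jsonText, """data""")         ' locate the "data" key
    If keyPos = 0 Then Exit Function

    startPos = InStr(keyPos, jsonText, "[")         ' opening bracket of the array
    If startPos = 0 Then Exit Function

    endPos = InStr(startPos, jsonText, "]")         ' first closing bracket (assumes no nesting)
    If endPos = 0 Then Exit Function

    ExtractDataArray = Mid$(jsonText, startPos, endPos - startPos + 1)
End Function
```

The returned string starts with "[" rather than "{", so after parsing it the first item would be reached with (1) directly instead of ("data")(1).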

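Answer 1 (score: 3)

It isn't clear to me whether you are having trouble getting the data in the first place, or extracting the part you need.

At the URL you provided, I don't see data like your sample. (The phrase bestBuy does not appear, and the numbers I see are "337..." rather than "37...".)

In any case, if your problem is getting data from the URL you provided into Excel, the easiest way is to use the built-in functionality rather than rewrite something that already exists.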


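"Get External Data"

In Excel, go to the Data tab and click From Web. When prompted, enter the URL and press Enter. The results vary depending on the page being loaded, but in this case I clicked Table 0 in the left pane and then clicked Load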


...and Excel then loads the data neatly into a table, which can be updated or refreshed automatically when needed, or manually with a single click.



getHTTP

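If for some reason you need to do this with VBA, that is possible too, although one way or another it will take some extra work.

To load JSON or HTML from any site, I use this function: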

Public Function getHTTP(ByVal sReq As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", sReq, False
        .Send
        getHTTP = StrConv(.responseBody, vbUnicode)
    End With
End Function

Usage example:

Since I'm not sure where you are getting your JSON data, the following example uses the Stack Exchange API to display the site status for Stack Overflow:

Sub GetSiteInfo()
    Const url = "https://api.stackexchange.com/2.2/info?site=stackoverflow"
    Dim json As String

    json = getHTTP(url) 'get JSON response

    If InStr(json, """error_id""") > 0 Or json = "" Then 'check for error
        MsgBox "There was a problem." & vbLf & vbLf & json, vbCritical
        Exit Sub
    End If

    json = Mid(json, InStr(json, "[{""") + 3) 'tidy response with string functions
    json = Left(json, InStr(json, "}],") - 1)
    json = Replace(Replace(Replace(json, Chr(34), ""), ",", vbNewLine), "_", " ")
    json = Replace(StrConv(Replace(json, ":", " :" & vbTab & vbTab _
            & vbTab), vbProperCase), " Per ", "/")

    MsgBox json, vbInformation, "Site Statistics" 'display response
End Sub

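Note how the extraction from a simple response like this can be handled with basic string functions.


Extracting a value with basic String functions

As another example, using the JSON data at the top of the question: if the string is in a variable named json and you want to extract the value of bestBuy, one way (of several possible ways) is as follows: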

Sub jsonExtract_Demo()
    Const json = "aaaaaaa""bestBuy"": ""6.04"",."           'for demo

    Dim pStart As Long, pStop As Long, bestBuy As Single
    Dim prefix As String, suffix As String
    prefix = "bestBuy"": """                                'equivalent to:   "bestBuy": "
    suffix = """"                                           'equivalent to a single   "

    pStart = InStr(json, prefix) + Len(prefix)              'find beginning of value
    pStop = InStr(pStart, json, suffix)                     'find end of value
    bestBuy = CSng(Mid(json, pStart, pStop - pStart))       'extract & convert value
    MsgBox "The value for 'bestBuy' is : " & bestBuy, vbInformation
End Sub
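
The same prefix/suffix idea could be wrapped in a reusable function, sketched below. GetJsonString is a hypothetical name. The sketch assumes the key's value is a quoted string with no escaped quotes inside it, so a key such as isinCode whose value is null would not be handled correctly, and it tolerates optional spaces around the colon.

```vba
' Sketch only (hypothetical helper): returns the quoted string value that follows a key,
' or an empty string when the key is not found.
Public Function GetJsonString(ByVal json As String, ByVal key As String) As String
    Dim prefix As String, pStart As Long, pStop As Long

    prefix = """" & key & """"                      ' the key wrapped in double quotes
    pStart = InStr(1, json, prefix)                 ' find the key
    If pStart = 0 Then Exit Function

    pStart = InStr(pStart + Len(prefix), json, ":") ' colon after the key
    If pStart = 0 Then Exit Function

    pStart = InStr(pStart, json, """")              ' opening quote of the value
    If pStart = 0 Then Exit Function
    pStart = pStart + 1                             ' first character of the value

    pStop = InStr(pStart, json, """")               ' closing quote of the value
    If pStop = 0 Then Exit Function

    GetJsonString = Mid$(json, pStart, pStop - pStart)
End Function
```

For example, GetJsonString(json, "bestBuy") would return the text between the quotes. Converting it with Val rather than CSng keeps the decimal point from depending on the regional settings, although Val stops at the first comma, so values such as "6,65,000" would need their commas removed first.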

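The WEBSERVICE worksheet function

The last thing I want to point out is an often-overlooked Excel worksheet function that works well for most plain-text responses (such as this JSON example):

Enter in a worksheet cell: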

=WEBSERVICE("https://raw.githubusercontent.com/bahamas10/css-color-names/master/css-color-names.json")

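...and you will immediately get the raw text result, which you can then manipulate as needed with worksheet functions. For XML responses, WEBSERVICE can be combined with FILTERXML to extract specific data using XPath, which is very handy for basic scraping needs.

More information is available at the links included above.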