尝试将URL Json导入Excel时出错

时间:2019-04-05 15:41:13

标签: json excel vba web-scraping winhttprequest

我试图通过WinHttpRequest从以下网址导入JSON格式的信息: https://bet.hkjc.com/football/getJSON.aspx?jsontype=odds_allodds.aspx&matchid=default

Sub test()

Dim xmlhttp As Object
Dim strUrl As String: strUrl = "https://bet.hkjc.com/football/getJSON.aspx?jsontype=odds_allodds.aspx&matchid=default"
Dim objRequest As Object

Set objRequest = CreateObject("WinHttp.WinHttpRequest.5.1")

    With objRequest
        .Open "GET", strUrl, False
        .send
    End With

    Debug.Print objRequest.responseText

End Sub

但是,它只显示了与Url类似的内容,但显示了很多乱码。

我想知道如何解决这个问题。 如果我使用其他网址,代码可以正常工作。

1 个答案:

答案 0 :(得分:0)

XHR:

我认为该页面已采取了防止漫游器的措施,如果怀疑您是漫游器,则会引发挑战,需要运行javascript。如果该命令成功运行,则会向XHR请求发出标头中的质询信息,并且如果您使用的是浏览器,则会导致您的内容正确更新以显示期望值。

我第一次运行GET请求时,得到了预期的json响应,此后,我得到了以下信息:

<HTML>
<head>
<script>
Challenge=649275;
ChallengeId=473313563;
GenericErrorMessageCookies="Cookies must be enabled in order to view this page.";
</script>
<script>
function test(var1)
{
    var var_str=""+Challenge;
    var var_arr=var_str.split("");
    var LastDig=var_arr.reverse()[0];
    var minDig=var_arr.sort()[0];
    var subvar1 = (2 * (var_arr[2]))+(var_arr[1]*1);
    var subvar2 = (2 * var_arr[2])+var_arr[1];
    var my_pow=Math.pow(((var_arr[0]*1)+2),var_arr[1]);
    var x=(var1*3+subvar1)*1;
    var y=Math.cos(Math.PI*subvar2);
    var answer=x*y;
    answer-=my_pow*1;
    answer+=(minDig*1)-(LastDig*1);
    answer=answer+subvar2;
    return answer;
}
</script>
<script>
client = null;
if (window.XMLHttpRequest)
{
    var client=new XMLHttpRequest();
}
else
{
    if (window.ActiveXObject)
    {
        client = new ActiveXObject('MSXML2.XMLHTTP.3.0');
    };
}
if (!((!!client)&&(!!Math.pow)&&(!!Math.cos)&&(!![].sort)&&(!![].reverse)))
{
    document.write("Not all needed JavaScript methods are supported.<BR>");

}
else
{
    client.onreadystatechange  = function()
    {
        if(client.readyState  == 4)
        {
            var MyCookie=client.getResponseHeader("X-AA-Cookie-Value");
            if ((MyCookie == null) || (MyCookie==""))
            {
                document.write(client.responseText);
                return;
            }
            
            var cookieName = MyCookie.split('=')[0];
            if (document.cookie.indexOf(cookieName)==-1)
            {
                document.write(GenericErrorMessageCookies);
                return;
            }
            window.location.reload(true);
        }
    };
    y=test(Challenge);
    client.open("POST",window.location,true);
    client.setRequestHeader('X-AA-Challenge-ID', ChallengeId);
    client.setRequestHeader('X-AA-Challenge-Result',y);
    client.setRequestHeader('X-AA-Challenge',Challenge);
    client.setRequestHeader('Content-Type' , 'text/plain');
    client.send();
}
</script>
</head>
<body>

您是否模仿javascript的功能并将其作为新的XHR传递(不确定)(不确定)。

您还可以尝试使用浏览器自动化功能,例如IE通过Microsoft Internet Controls或Chrome / FF等通过Selenium Basic,以查看是否让javascript在页面上运行可以解决此问题。


处理挑战:(WIP)

我开始考虑尝试解决这个问题。目前,我一直在获取json响应,因此尚未完全测试底部。我希望有些分钟 *我们关心吗?错误的余地,仅仅是因为Math.PI给出了3.141592653589793,而Application.PI给出了{ {1}}

3.14159265358979

基于浏览器的解决方案:

使用Microsoft Internet Controls进行标准IE自动化会导致“另存为/打开对话框”提示。

使用硒可以避免出现此提示,并从pre元素中获取数据。使用硒可以使您受益于隐式等待,该等待使页面可以完成发出的所有挑战。您可以使用明确的等待条件来增加等待时间。

Option Explicit
Public Sub GetInfo()
    Dim json As Object, s As String, re As Object, ws As Worksheet
    Dim pattern1 As String, pattern2 As String, challenge As Long, challengeId As Long
    Const URL As String = "https://bet.hkjc.com/football/getJSON.aspx?jsontype=odds_allodds.aspx&matchid=default"
    pattern1 = "Challenge=(\d+);"
    pattern2 = "ChallengeId=(\d+);"
    Set re = CreateObject("vbscript.regexp")
    Set ws = ThisWorkbook.Worksheets("Sheet1")

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        s = .responseText
        On Error Resume Next
        Set json = JsonConverter.ParseJson(s)
        On Error GoTo 0
        If Not json Is Nothing Then
            Debug.Print "No challenge issued"
            Debug.Print .responseText
        Else
            On Error GoTo errhand
            challenge = GetId(re, s, pattern1)
            If challenge = 999 Then Exit Sub     'should really use more unlikely value.
            challengeId = GetId(re, s, pattern2)
            .Open "POST", URL, False
            .setRequestHeader "X-AA-Challenge-ID", challengeId
            .setRequestHeader "X-AA-Challenge-Result", CLng(GetAnswer(challenge))
            .setRequestHeader "X-AA-Challenge", challenge
            .setRequestHeader "Content-Type", "text/plain"
            .send ""
            Debug.Print .Status, .responseText
            If .Status = 200 Then
                .Open "GET", URL, False
                .setRequestHeader "User-Agent", "Mozilla/5.0"
                .send
                s = .responseText
                Debug.Print s
            End If
        End If
    End With
    Exit Sub
errhand:
    Debug.Print Err.Number, Err.Description
End Sub

Public Function GetId(ByVal re As Object, ByVal s As String, ByVal pattern As String) As Long
    With re
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .pattern = pattern
        If .TEST(s) Then
            GetId = .Execute(s)(0).SubMatches(0)
        Else
            GetId = 999                          '<probably should use a more unlikely number here!
        End If
    End With
End Function

Public Function GetAnswer(ByVal challenge As Long) As String 'var1  'challenge
    Dim var_str As String, var_arr() As Long, LastDig As Long, minDig As Long
    Dim i As Long

    var_str = Chr$(34) & challenge & Chr$(34)
    ReDim var_arr(0 To Len(var_str) - 3)

    For i = 2 To Len(var_str) - 1
        var_arr(i - 2) = CLng(Mid$(var_str, i, 1))
    Next i

    LastDig = var_arr(UBound(var_arr))
    minDig = Application.Min(var_arr)

    Dim my_pow As Long, x As Long, y As Long, answer As Variant
    Dim subvar1 As Long, subvar2 As String

    subvar1 = 2 * Application.Small(var_arr, 3) + Application.Small(var_arr, 2)
    subvar2 = CStr(2 * Application.Small(var_arr, 3)) & CStr(Application.Small(var_arr, 2))
    my_pow = (minDig + 2) ^ Application.Small(var_arr, 2)
    x = challenge * 3 + (subvar1 * 1)
    y = Evaluate("=COS(PI()* " & CLng(subvar2) & ")")
    answer = x * y
    answer = answer - my_pow
    answer = answer + minDig - LastDig
    answer = CStr(answer) & subvar2
    GetAnswer = answer
End Function

参考:

请注意,我正在使用json parser。从该链接添加.bas之后,您需要转到VBE>工具>引用>添加对Microsoft脚本运行时的引用。


1 RubberDuckVBA团队12的一些观点