From HTML string

Time: 2016-06-10 11:09:00

Tags: php json curl

I am using cURL to navigate to a web page. With the response from the cURL script, I basically do the following:

$dom = new DOMDocument();
$dom->loadHTML($response);

If I output $dom, I can see all of the page's HTML, as expected. Within that HTML there is one particular section, which looks like this:

<script id="data" type="application/json">
<![CDATA[
{
    sortColumn: "QuoteNumber",
    quotes: {
        "Data":
        [
            {
                "ID":3235720,
                "Date":"20 May 2016",
                "QuoteNumber":"Q12415",
                "Name":"Some Name",
                "Client":"Some Client",
                "StateName":"Issued",
                "Url":"/Quote/View/3235720"
            }
        ]
    }
}
]]>
</script>

Is there any way to target this specific block of code? I basically need to load the JSON and get the ID of the Quote. Is this possible?

1 Answer:

Answer 0: (score: 2)

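  1. You can get the <script> element with getElementById("data").
  2. Check for the CDATA node by comparing its nodeType with the constant XML_CDATA_SECTION_NODE.
  3. Remove the CDATA markers with str_replace().
  4. Parse the content as JSON with json_decode().
  5. By the way, the content inside the CDATA is actually malformed JSON: the keys sortColumn and quotes are not quoted. It should be corrected as shown below: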

    <![CDATA[
    {
        "sortColumn" : "QuoteNumber",
        "quotes": {
            "Data":
            [
                {
                    "ID":3235720,
                    "Date":"20 May 2016",
                    "QuoteNumber":"Q12415",
                    "Name":"Some Name",
                    "Client":"Some Client",
                    "StateName":"Issued",
                    "Url":"/Quote/View/3235720"
                }
            ]
        }
    }
    ]]>
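
To illustrate why the correction matters, here is a minimal sketch that feeds a shortened version of both payloads to json_decode(). The variable names and the trimmed-down sample strings are assumptions made for the example, not part of the original page.

```php
<?php
// Shortened copy of the original payload: sortColumn and quotes are unquoted,
// which the JSON grammar does not allow.
$malformed = '{ sortColumn: "QuoteNumber", quotes: { "Data": [ { "ID": 3235720 } ] } }';

// Shortened copy of the corrected payload: every key is a double-quoted string.
$corrected = '{ "sortColumn": "QuoteNumber", "quotes": { "Data": [ { "ID": 3235720 } ] } }';

$bad = json_decode($malformed);
$badError = json_last_error();   // error code left behind by the failed decode

$good = json_decode($corrected);
$goodError = json_last_error();  // error code after the successful decode

var_dump($bad === null);                          // the malformed input decodes to null
var_dump($badError === JSON_ERROR_SYNTAX);        // and is reported as a syntax error
var_dump($goodError === JSON_ERROR_NONE);         // the corrected input decodes cleanly
echo $good->quotes->Data[0]->ID, "\n";
```

If the remote page really serves the unquoted form, json_decode() alone will not be enough, and the payload would have to be repaired before decoding.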
    

    I have also added a has_json_error() function at the bottom so that you can see some error messages.

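    $dom = new DOMDocument();
    $dom->loadHTML($response);
    // Locate the <script id="data"> element.
    $data = $dom->getElementById("data");
    $content = '';
    if ($data !== null) {
        foreach ($data->childNodes as $child) {
            // The JSON payload is held in a CDATA section node.
            if ($child->nodeType == XML_CDATA_SECTION_NODE) {
                $content = $child->textContent;
            }
        }
    }
    // Strip the literal CDATA markers so that only the JSON remains.
    $content = str_replace(array("<![CDATA[", "]]>"), '', $content);
    $jsons = json_decode($content);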
    
    if(!has_json_error()) {
        echo $jsons->sortColumn;
        echo "<br /><br />";
        print_r($jsons->quotes);
        echo "<br /><br />";
        $data = $jsons->quotes->Data;
        foreach($data as $obj) {
            echo $obj->ID . "<br />";
            echo $obj->Date . "<br />";
            echo $obj->QuoteNumber . "<br />";
            echo $obj->Name . "<br />";
            echo $obj->Client . "<br />";
            echo $obj->StateName . "<br />";
            echo $obj->Url . "<br />";
        }
    }
    
    function has_json_error() {
        if (function_exists ( 'json_last_error' ) && json_last_error() !== JSON_ERROR_NONE) {
            switch (json_last_error()) {
                case JSON_ERROR_DEPTH:
                    echo 'JSON_ERROR: - Maximum stack depth exceeded';
                break;
                case JSON_ERROR_STATE_MISMATCH:
                    echo 'JSON_ERROR: - Underflow or the modes mismatch';
                break;
                case JSON_ERROR_CTRL_CHAR:
                    echo 'JSON_ERROR: - Unexpected control character found';
                break;
                case JSON_ERROR_SYNTAX:
                    echo 'JSON_ERROR: - Syntax error, malformed JSON';
                break;
                case JSON_ERROR_UTF8:
                    echo 'JSON_ERROR: - Malformed UTF-8 characters, possibly incorrectly encoded';
                break;
                default:
                    echo 'JSON_ERROR: - Unknown error: ' . json_last_error();
                break;
            }           
            return true;
        }
        else if (function_exists ( 'json_last_error_msg' ) && json_last_error_msg () !== "No error") {
            echo ("json_last_error_msg, JSON_ERROR:" . json_last_error_msg ());
            return true;
        }
        return false;
    }
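
As a shorter alternative to the switch above, the error check can be sketched with json_last_error_msg() (available since PHP 5.5) and an exception. The helper name decode_json_or_fail() is hypothetical and is not part of the answer; this is only one possible arrangement.

```php
<?php
// Hypothetical helper (an assumption for illustration): decode a JSON string,
// or throw an exception carrying PHP's own description of the failure.
function decode_json_or_fail($json)
{
    $value = json_decode($json);
    // Check the error code rather than the return value, because the valid
    // JSON literal "null" also decodes to null.
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('JSON_ERROR: ' . json_last_error_msg());
    }
    return $value;
}

$message = '';
try {
    // Unquoted key, as in the original page, so decoding fails.
    decode_json_or_fail('{ sortColumn: "QuoteNumber" }');
} catch (RuntimeException $e) {
    $message = $e->getMessage();
    echo $message, "\n";
}
```

Throwing instead of echoing lets the caller decide how to report the problem. On PHP 7.3 or later, passing the JSON_THROW_ON_ERROR flag to json_decode() gives similar behaviour without a helper.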
    

    The result of the above snippet looks like this:

    QuoteNumber
    
    stdClass Object ( 
        [Data] => Array ( 
            [0] => stdClass Object ( 
                [ID] => 3235720 
                [Date] => 20 May 2016 
                [QuoteNumber] => Q12415 
                [Name] => Some Name 
                [Client] => Some Client 
                [StateName] => Issued 
                [Url] => /Quote/View/3235720 
            ) 
        ) 
    ) 
    
    3235720
    20 May 2016
    Q12415
    Some Name
    Some Client
    Issued
    /Quote/View/3235720
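
Since the question only asks for the Quote ID, the steps above can be condensed into a small function. This is a sketch under stated assumptions: the function name extract_quote_ids() and the inline sample HTML are hypothetical, the payload is assumed to be the corrected (valid) JSON, and the script element is located with DOMXPath rather than getElementById(). It also reads textContent from the element itself, so it does not depend on which node type the parser assigns to the script body.

```php
<?php
// Sample response (an assumption): a cut-down page carrying the corrected JSON.
$response = <<<'HTML'
<html><body>
<script id="data" type="application/json">
<![CDATA[
{ "sortColumn": "QuoteNumber", "quotes": { "Data": [ { "ID": 3235720, "QuoteNumber": "Q12415" } ] } }
]]>
</script>
</body></html>
HTML;

// Hypothetical helper: return the ID of every quote found in <script id="data">.
function extract_quote_ids($html)
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Find the script element by its id attribute.
    $xpath = new DOMXPath($dom);
    $script = $xpath->query('//script[@id="data"]')->item(0);
    if ($script === null) {
        return array();
    }

    // Drop the literal CDATA markers, then decode what is left.
    $json = str_replace(array('<![CDATA[', ']]>'), '', $script->textContent);
    $decoded = json_decode($json);
    if ($decoded === null || !isset($decoded->quotes->Data)) {
        return array();
    }

    $ids = array();
    foreach ($decoded->quotes->Data as $quote) {
        $ids[] = $quote->ID;
    }
    return $ids;
}

$ids = extract_quote_ids($response);
print_r($ids);
```

Returning an empty array when the element is missing or the JSON is invalid keeps the caller's loop simple; a stricter version could raise an error instead.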