Force a service to return JSON instead of the default HTML

Date: 2016-01-10 00:25:45

Tags: php curl

I would like to write PHP code that does the same thing as this curl command:

curl -F out=json --form-string 'content=<!DOCTYPE html><html><head><title>check it</title></head><body></body></html>' http://validator.w3.org/nu/ 

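This curl command returns JSON as expected. Maybe I am missing something in their documentation: https://github.com/validator/validator/wiki/Service:-Input:-POST-body and https://github.com/validator/validator/wiki/Service%3A-HTTP-interface

The problem I have now is that the web service returns HTML instead of JSON. I set the Accept header to JSON, but it has no effect. I also tried setting both Accept and Content-Type, but that triggers an error from the web service saying the input is invalid. Here is the code I need your help with: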

$html = "<!DOCTYPE html><html><head><title>test</title></head><body></body></html>";
$endPoint = "http://validator.w3.org/nu/";
$timeout = 5000;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $endPoint);
curl_setopt($ch, CURLOPT_TIMEOUT_MS, $timeout);
curl_setopt($ch,CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
//curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: application/json', 'Content-Type: application/json'));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: application/json'));
curl_setopt($ch,CURLOPT_POSTFIELDS, array('content' => $html, 'out' => 'json'));
$output = curl_exec($ch);
if(curl_errno($ch))
{
    echo curl_error($ch);
}
curl_close($ch);    
error_log(__FILE__. ": " . __LINE__ . ": " . var_export($output, true));
echo $output;

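After reading Ignacio's question, I am updating this post with the following information from the W3C documentation page:

Their documentation says the HTML string should be sent in the HTTP body, and their Java library does it like this: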

String response = null;
String source = "your html here";
HttpResponse<String> uniResponse = Unirest.post("http://localhost:8080/vnu")
    .header("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36")
    .header("Content-Type", "text/html; charset=UTF-8")
    .queryString("out", "gnu")
    .body(source)
    .asString();
response = uniResponse.getBody();

Could this be a hint for you? Just to let you know, I have tried both

http://validator.w3.org/nu/?out=json

http://validator.w3.org/nu/

endpoints (as the value of the $endPoint variable in the PHP script above).
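For reference, the raw-body request shape used by the Java snippet above could be sketched in PHP roughly as follows. This is only an illustration under assumptions: the function names build_validator_url and validate_raw_html and the User-Agent string are hypothetical, and the sketch has not been verified against the live service.

```php
<?php
// Append the output format as a query parameter, e.g. ".../nu/?out=json".
function build_validator_url(string $endPoint, string $out): string
{
    return $endPoint . '?' . http_build_query(array('out' => $out));
}

// Send the HTML document itself as the request body (no form encoding).
// Hypothetical helper; requires the PHP cURL extension.
function validate_raw_html(string $html, string $endPoint): string
{
    $ch = curl_init(build_validator_url($endPoint, 'json'));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // Passing a string (not an array) makes cURL send it as the body as-is.
    curl_setopt($ch, CURLOPT_POSTFIELDS, $html);
    // The Content-Type describes the document, as in the Java example.
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/html; charset=UTF-8'));
    // Assumed User-Agent value, chosen only for illustration.
    curl_setopt($ch, CURLOPT_USERAGENT, 'example-validator-client/1.0');

    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return $response;
}

// Usage (performs a network request, so it is left commented out):
// echo validate_raw_html($html, 'http://validator.w3.org/nu/');
```

The main difference from the script above is that CURLOPT_POSTFIELDS receives a string, so the document travels as the body, and the out parameter moves into the query string.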

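1 Answer:

Answer 0: (score: 1)

To get the result you are looking for, you have to send the data as multipart/form-data (you can check the validator page, or the request that curl sends, to see that the data is sent as multipart/form-data). Take this as an example: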

$url = 'http://validator.w3.org/nu/';
$html = '<!DOCTYPE html><html><head><title>test</title></head><body></body></html>';

$boundary = 'your-boundary'; 

$body = '--' . $boundary . "\r\n";
// set the "out" as "json"
$body .= 'Content-Disposition: form-data; name="out"' . "\r\n" . "\r\n";
$body .= 'json' . "\r\n";
$body .= "--" . $boundary ."\r\n";
// set the "content"
$body .= 'Content-Disposition: form-data; name="content"' . "\r\n" . "\r\n";
$body .= $html . "\r\n";
$body .= "--" . $boundary . "--" . "\r\n" . "\r\n";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: multipart/form-data; boundary='.$boundary));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $body);

echo curl_exec($ch);
curl_close($ch);
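The body assembly above can be generalized a little. The following is a minimal sketch, assuming only plain text fields (no file uploads); build_multipart_body is a hypothetical helper name, and the random boundary is one possible way to make a clash with the payload unlikely.

```php
<?php
/**
 * Build a multipart/form-data body from name => value text fields.
 * Hypothetical helper for illustration; it does not handle file uploads.
 */
function build_multipart_body(array $fields, string $boundary): string
{
    $body = '';
    foreach ($fields as $name => $value) {
        // Each part: delimiter line, part header, blank line, then the value.
        $body .= '--' . $boundary . "\r\n";
        $body .= 'Content-Disposition: form-data; name="' . $name . '"' . "\r\n\r\n";
        $body .= $value . "\r\n";
    }
    // The closing delimiter ends with two extra dashes.
    $body .= '--' . $boundary . "--\r\n";
    return $body;
}

// A random boundary is unlikely to appear inside the HTML being sent.
$boundary = 'php-boundary-' . bin2hex(random_bytes(8));
$html = '<!DOCTYPE html><html><head><title>test</title></head><body></body></html>';
$body = build_multipart_body(array('out' => 'json', 'content' => $html), $boundary);
```

The resulting $body and $boundary can then be passed to CURLOPT_POSTFIELDS and the Content-Type header in the same way as in the code above.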

Then you will get something like this:

{
    "messages": [{
            "type": "info",
            "message": "The Content-Type was “text/html”. Using the HTML parser."
        }, {
            "type": "info",
            "message": "Using the schema for HTML with SVG 1.1, MathML 3.0, RDFa 1.1, and ITS 2.0 support."
        }],
    "source": {
        "type": "text/html",
        "encoding": "utf-8",
        "code": "<!DOCTYPE html><html><head><title>test</title></head><body></body></html>"
    }
}
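Once the service answers with JSON, the response can be decoded with json_decode. Here is a minimal sketch that counts the messages by type; count_messages_by_type is a hypothetical helper name, and $output stands in for the string returned by curl_exec (a shortened copy of the response above is used so the snippet runs on its own).

```php
<?php
// Decode the validator response and count its messages per "type".
// Hypothetical helper for illustration.
function count_messages_by_type(string $json): array
{
    $data = json_decode($json, true);
    // json_decode() returns null for non-JSON input such as an HTML page.
    if (!is_array($data) || !isset($data['messages']) || !is_array($data['messages'])) {
        throw new RuntimeException('Response is not the expected JSON document');
    }
    $counts = array();
    foreach ($data['messages'] as $message) {
        $type = isset($message['type']) ? $message['type'] : 'unknown';
        $counts[$type] = isset($counts[$type]) ? $counts[$type] + 1 : 1;
    }
    return $counts;
}

// Stand-in for the curl_exec() result, shortened from the response above.
$output = '{"messages":[{"type":"info","message":"The Content-Type was “text/html”. Using the HTML parser."},'
        . '{"type":"info","message":"Using the schema for HTML with SVG 1.1, MathML 3.0, RDFa 1.1, and ITS 2.0 support."}]}';

print_r(count_messages_by_type($output));
```

Throwing when the decoded value is not the expected structure also makes it obvious when the service has fallen back to returning HTML.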

Hope that helps.