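Loading an XSL-rendered page with curl returns an "actual cardinality" error

Date: 2011-05-13 15:01:06

Tags: php xpath curl xquery

We have a DITA XML application that generates XHTML on the fly, and it looks fine when viewed in a browser.

An example URL is: http://livecontent.jordanpublishing.co.uk/content/en/FAMILY-201103311115/Family_FLJONLINE_FLJ_2009_07_4

[Screenshot of our application]

If I try to load the URL with curl, I get the following error: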

Error checking function parameter 3 in call transform:transform($fDoc, LiveContent-UI:get_xsl("ui/ui_skin.xsl", ""), LiveContent-UI:get_xsl_params(untyped-value-check[xs:string, $skin], $extra_params)): The actual cardinality for parameter 1 does not match the cardinality declared in the function's signature: LiveContent-Util:browser_from_user_agent($a as xs:string) xs:string. Expected cardinality: exactly one, got 0.

XQuery Stack Trace:
LiveContent-Util:browser_from_user_agent(xs:string)   161:55
LiveContent-UI:get_xsl_params(xs:string, node())    145:25
LiveContent-UI:get_html(xs:string, xs:string, node(), node())   313:25
LiveContent-Pub:home(xs:string, xs:string, xs:string)   65:17
Java Stack Trace:
Class Name    Method Name    File Name    Line
org.exist.xquery.DynamicCardinalityCheck    eval    DynamicCardinalityCheck.java    80
org.exist.xquery.Atomize    eval    Atomize.java    66
org.exist.xquery.UntypedValueCheck  eval    UntypedValueCheck.java  75
org.exist.xquery.DynamicTypeCheck   eval    DynamicTypeCheck.java   61
org.exist.xquery.FunctionCall   eval    FunctionCall.java   185
org.exist.xquery.AbstractExpression eval    AbstractExpression.java 61
org.exist.xquery.PathExpr   eval    PathExpr.java   241
org.exist.xquery.AttributeConstructor   eval    AttributeConstructor.java   95
org.exist.xquery.ElementConstructor eval    ElementConstructor.java 212
org.exist.xquery.AbstractExpression eval    AbstractExpression.java 61
org.exist.xquery.PathExpr   eval    PathExpr.java   241
org.exist.xquery.ElementConstructor eval    ElementConstructor.java 271
org.exist.xquery.AbstractExpression eval    AbstractExpression.java 61
org.exist.xquery.PathExpr   eval    PathExpr.java   241
org.exist.xquery.DebuggableExpression   eval    DebuggableExpression.java   56
org.exist.xquery.DebuggableExpression   eval    DebuggableExpression.java   63
org.exist.xquery.LetExpr    eval    LetExpr.java    208
org.exist.xquery.BindingExpression  eval    BindingExpression.java  158
org.exist.xquery.AbstractExpression eval    AbstractExpression.java 61
org.exist.xquery.PathExpr   eval    PathExpr.java   241

I have no idea what is going wrong.

The PHP curl code is as follows:

$ch = curl_init();
/**
* Set the URL of the page or file to download.
*/
curl_setopt($ch, CURLOPT_URL, 'http://onlineservices.letterpart.com/sitemap.xml;jsessionid=1j1agloz5ke7l?id=1j1agloz5ke7l');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec ($ch);
curl_close ($ch);

$xml = new SimpleXMLElement($data);
foreach ($xml->url as $url_list) {
    $url = $url_list->loc;
    echo $url ."<br>";
    //file_get_contents($url);

    echo $url ."<br>";
    $ch = curl_init($url); //load the urls
    echo $url ."<br>";
    curl_setopt($ch, CURLOPT_TIMEOUT_m2, 20); //No need to wait for it to load. Execute it and go.
    curl_exec($ch); //Execute
    curl_close($ch); //Close it off
}

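Can anyone help? I'm a bit lost, as this is beyond my skill level.

Thanks,

Edit

It was suggested that I add a user agent, so I added the following:

curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");

I now get the following errors in the log: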

[13-May-2011 16:30:14] PHP Notice:  Use of undefined constant CURLOPT_TIMEOUT_m2 - assumed 'CURLOPT_TIMEOUT_m2' in /home/digital1/public_html/dev/sitemap.php on line 43
[13-May-2011 16:30:14] PHP Warning:  curl_setopt() [<a href='function.curl-setopt'>function.curl-setopt</a>]: Invalid curl configuration option in /home/digital1/public_html/dev/sitemap.php on line 43

The line in question (43) is:

curl_setopt($ch, CURLOPT_TIMEOUT_m2, 20); //No need to wait for it to load. Execute it and go.

I seem to have better luck using the Googlebot user agent:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

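As in, the content is actually written to the screen, but my log still shows these errors.

1 Answer:

Answer 0: (score: 3)

My immediate guess is that your Java application expects the User-Agent header to be set. Since curl does not send a User-Agent header by default, you need to set one yourself. Try adding this option above the line that sets CURLOPT_TIMEOUT_m2: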

curl_setopt($ch, CURLOPT_USERAGENT, "PHP/".PHP_VERSION );

If for some reason it doesn't like that User-Agent string, you may want to try the string from an actual browser.
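
As a rough sketch of that suggestion, the options for each page request could be collected in one place so the User-Agent is always present. The helper names `buildCurlOptions` and `fetchPage` are hypothetical and not part of the original code; the default string is the one suggested above, and a browser or Googlebot string can be passed in its place.

```php
<?php
// Hypothetical helper: builds the option array for a single page request.
// The User-Agent defaults to "PHP/<version>"; pass a browser string to override it.
function buildCurlOptions(string $url, ?string $userAgent = null): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent ?? 'PHP/' . PHP_VERSION,
    ];
}

// Hypothetical helper: fetches one page and returns its body, or false on failure.
function fetchPage(string $url, ?string $userAgent = null)
{
    $ch = curl_init();
    curl_setopt_array($ch, buildCurlOptions($url, $userAgent));
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

Inside the `foreach` loop from the question, `fetchPage((string) $url)` would then stand in for the separate `curl_init`, `curl_setopt`, `curl_exec` and `curl_close` calls.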

Edit

Based on your edit, that is because you have mistyped the curl timeout constant. It should be CURLOPT_TIMEOUT_MS, not CURLOPT_TIMEOUT_m2.
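
One way to catch this kind of typo early is to resolve the option by name and fail loudly when no such constant exists. This is only a sketch: the helper `curlOption` and the placeholder URL are assumptions for illustration. Note that CURLOPT_TIMEOUT_MS is measured in milliseconds, so a value of 20 means 20 ms, whereas CURLOPT_TIMEOUT takes seconds.

```php
<?php
// Hypothetical helper: looks up a cURL option constant by name and
// throws on an unknown name instead of letting PHP guess at it.
function curlOption(string $name): int
{
    if (!defined($name)) {
        throw new InvalidArgumentException("Unknown cURL option: $name");
    }
    return constant($name);
}

$ch = curl_init('http://example.com/'); // placeholder URL, not from the question
curl_setopt($ch, curlOption('CURLOPT_TIMEOUT_MS'), 20); // 20 milliseconds, not seconds
```

With this helper, `curlOption('CURLOPT_TIMEOUT_m2')` throws an exception at the call site rather than producing the notice and warning seen in the log.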