In my code, the regexp works fine when I assign the HTML content to a variable, but not when I fetch the content from the URL path. I get an empty array.
<?php
$productmfgno = "154637401";
$url = "http://www.pandorasoem.com/search#q=".$productmfgno;
$ch1= curl_init();
curl_setopt ($ch1, CURLOPT_URL, $url );
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch1,CURLOPT_VERBOSE,1);
curl_setopt($ch1, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0)');
curl_setopt ($ch1, CURLOPT_REFERER,'http://www.google.com'); //just a fake referer
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch1,CURLOPT_POST,0);
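curl_setopt($ch1, CURLOPT_FOLLOWLOCATION, true); // boolean switch, not a redirect count
curl_setopt($ch1, CURLOPT_MAXREDIRS, 20); // cap on the number of redirects followed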
$htmlContent= curl_exec($ch1);
curl_close($ch1);
/* It works when I assign this html content to $htmlContent variable but not working with cURL url
$htmlContent = '<div class="findify-navigation-header findify-clearfix"> <div class="findify-pagination findify-push-right"></div> <div class="findify-header">Showing 2 results for <span class="findify-query">"154637401"</span>. <span id="findify-didyoumean"></span></div> </div>';
*/
preg_match_all('/<div.*class=\"findify\-header\".*?>(.*?)<span.*class=\"findify-query\">.*?<\/div>/Us', $htmlContent, $count);
print_r($count);
Expected result: Showing 2 results for
so that I can get the result count.
Answer 0 (score: 1)
The thing is, the page you are requesting contains no results. The actual search is performed via Ajax after the page has loaded.
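One detail visible in the question's URL supports this: everything after `#` is a URL fragment, which stays on the client side and is not sent to the server, so cURL only ever requests the bare search page. A minimal sketch using `parse_url` to show the split (the variable names `$parts`, `$fragment`, and `$requested` are illustrative only):

```php
<?php
// The part after '#' is a URL fragment; HTTP clients such as cURL do not send it.
$productmfgno = "154637401";
$url = "http://www.pandorasoem.com/search#q=" . $productmfgno;

$parts = parse_url($url);
$fragment = $parts['fragment'] ?? ''; // the search term lives only here
$path = $parts['path'] ?? '/';

// What the server actually receives: scheme, host and path, without the fragment.
$requested = $parts['scheme'] . '://' . $parts['host'] . $path;

echo "Requested from server: $requested\n";
echo "Kept client-side only: $fragment\n";
```

The page's JavaScript reads the fragment in the browser and then issues the real search request, which is why the static HTML returned to cURL has no result count in it.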
The Ajax endpoint for the search, which is probably what you are looking for, returns its results as JavaScript code (not JSON). That is:
UPD: Since the format is different, you will need a new regular expression. Something like this will do:
preg_match_all('/["\']?totalHits["\']?\s*:\s*(\d+)/i', $htmlContent, $count); // PHP's PCRE has no "g" modifier; preg_match_all is already global
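As a rough sketch of how the count could be pulled out once the Ajax response is in hand: the sample `$response` string below is made up for illustration (the real payload may be shaped differently), and `extractTotalHits` is a hypothetical helper name. Only the `i` modifier is used, since PHP's PCRE does not accept `g`.

```php
<?php
// Hypothetical sample of a JavaScript-style response; the real payload may differ.
$response = 'callback({"meta": {"totalHits": 2, "limit": 24}, "items": []});';

// Return the first totalHits value found in the body, or null if none is present.
function extractTotalHits(string $body): ?int
{
    if (preg_match('/["\']?totalHits["\']?\s*:\s*(\d+)/i', $body, $m)) {
        return (int) $m[1];
    }
    return null;
}

var_dump(extractTotalHits($response));
```

`preg_match` is enough here because only a single count is expected; returning `null` on no match makes it easy to tell "zero results" apart from "count not found in the response".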