Combining curl multi and strpos() to find text on web pages

Asked: 2015-01-27 16:09:26

Tags: php curl curl-multi

EDITED

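I am trying to use curl multi to check the responses from several websites and, in addition, check each curl response for a piece of text. I have grouped the data into an array, but I can't work out whether I am using the correct / most efficient way to run strpos() with the 'post' text.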

$data = array(array());

$data[0]['url']  = 'http://www.google.com';
$data[0]['post'] = 'google text';

$data[1]['url']  = 'http://www.yahoo.com';
$data[1]['post'] = 'yahoo text';

$r = multiRequest($data);

echo '<pre>';
print_r($r);

Here is my function:

function multiRequest($data, $options = array()) {

  // array of curl handles
  $curly = array();

  // data to be returned
  $result = array();

  // multi handle
  $mh = curl_multi_init();

  // loop through $data and create curl handles
  // then add them to the multi-handle
  foreach ($data as $id => $d) {

    $curly[$id] = curl_init();

    $url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
    curl_setopt($curly[$id], CURLOPT_URL,            $url);
    curl_setopt($curly[$id], CURLOPT_HEADER,         0);
    curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curly[$id], CURLOPT_USERAGENT,      'Chrome: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2');

    // extra options?
    if (!empty($options)) {
      curl_setopt_array($curly[$id], $options);
    }

    curl_multi_add_handle($mh, $curly[$id]);
  }

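  // execute the handles, waiting for activity instead of busy-looping
  $running = null;
  do {
    $status = curl_multi_exec($mh, $running);
    if ($running > 0) {
      curl_multi_select($mh);
    }
  } while ($running > 0 && $status == CURLM_OK);

  // get content and remove handles
  foreach ($curly as $id => $c) {
    $result[$id][] = curl_getinfo($c, CURLINFO_EFFECTIVE_URL);
    $result[$id][] = curl_getinfo($c, CURLINFO_HTTP_CODE);
    $result[$id][] = curl_getinfo($c, CURLINFO_CONTENT_TYPE);

    // $curly and $data share the same keys, so no second loop over $data
    // is needed. curl_multi_getcontent() reads the body that was already
    // fetched; calling curl_exec() here would send the request again.
    if (is_array($data[$id]) && isset($data[$id]['post'])) {
      $text = (string) curl_multi_getcontent($c);
      // int offset of the first match, or false when the text is absent
      $result[$id][] = strpos($text, $data[$id]['post']);
    }
    curl_multi_remove_handle($mh, $c);
  }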

  // all done
  curl_multi_close($mh);

  return $result;
}

Can anyone advise whether my solution is appropriate? Is there a more efficient / better way to perform my strpos() check?

Thanks

1 Answer:

Answer 0 (score: 0)

Use an array so that you can keep each URL, search string, and curl handle associated together:

$stuff = array(
   0 => array('url' => 'google', 'text' => 'googletext', 'curl' => null),
   1 => array('url' => 'yahoo', 'text' => 'yahootext', 'curl' => null),
   // etc.
);

$mh = curl_multi_init();

foreach($stuff as $key => $info) {
   // store each handle next to its URL and search text
   $stuff[$key]['curl'] = curl_init($stuff[$key]['url']);
   curl_multi_add_handle($mh, $stuff[$key]['curl']);
}

Then do a similar loop when processing the results.
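As a rough sketch of what that second loop might look like (not necessarily what the answer had in mind): the names `containsText`, `fetchAndSearch`, and the added `http_code` / `found` keys are hypothetical and chosen only for illustration. It assumes the `$stuff` layout shown above, reads each body with `curl_multi_getcontent()`, and compares the `strpos()` result with `!== false`, because a match at offset 0 would otherwise look like "not found".

```php
<?php
// Hypothetical helper: true if $needle occurs anywhere in $body.
// strpos() returns 0 for a match at the very start of the string,
// so the result is compared with !== false rather than used as a boolean.
function containsText(string $body, string $needle): bool
{
    return $needle !== '' && strpos($body, $needle) !== false;
}

// Hypothetical function: fetch every URL in $stuff in parallel and record
// whether the expected text appeared in each response.
function fetchAndSearch(array $stuff): array
{
    $mh = curl_multi_init();

    // First loop: create one handle per entry and keep it with its data.
    foreach ($stuff as $key => $info) {
        $ch = curl_init($info['url']);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $stuff[$key]['curl'] = $ch;
    }

    // Drive the transfers; curl_multi_select() waits for socket activity
    // so the loop does not spin at full CPU.
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Second, "similar" loop: same keys, so each handle is already paired
    // with the text it should be checked against.
    foreach ($stuff as $key => $info) {
        $ch   = $info['curl'];
        $body = (string) curl_multi_getcontent($ch);

        $stuff[$key]['http_code'] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $stuff[$key]['found']     = containsText($body, $info['text']);

        curl_multi_remove_handle($mh, $ch);
        unset($stuff[$key]['curl']); // drop the handle from the returned data
    }

    curl_multi_close($mh);

    return $stuff;
}
```

With network access and the curl extension available, this could be called as `print_r(fetchAndSearch($stuff));` using full URLs such as `http://www.google.com` in the `url` fields. Keeping the handle inside the same array entry as its search text is what removes the need to match responses back to URLs afterwards.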