I want to download large images from Google search. Below is my code, which works for downloading small images from Google search.
<?php
include_once('simple_html_dom.php');
set_time_limit(0);

$fp = fopen('csv/search.csv', 'r') or die("can't open file");
$image_count = 1; // number of images to save per search term

while ($csv_line = fgetcsv($fp)) {
    foreach ($csv_line as $term) {
        $imgname = $term;
        $search_query = urlencode(trim($term));
        $html = file_get_html('http://images.google.com/images?as_q='. $search_query .'&hl=en&imgtbs=z&btnG=Search+Images&as_epq=&as_oq=&as_eq=&imgtype=&imgsz=m&imgw=&imgh=&imgar=&as_filetype=&imgc=&as_sitesearch=&as_rights=&safe=images&as_st=y');
        if (!$html) {
            continue; // request failed; skip this term
        }
        $images = $html->find('img');
        // Use a separate counter here: reusing $i would clobber the outer loop index.
        $saved = 0;
        foreach ($images as $image) {
            if ($saved == $image_count) break;
            $saved++;
            // Every image for a term gets the same name, so later ones overwrite earlier ones.
            $filename = "Images/" . $imgname . ".jpg";
            file_put_contents($filename, file_get_contents($image->src));
        }
    }
}
fclose($fp);
?>
Any ideas?
Answer 0 (score: 2)
This worked for me. simple_html_dom.php won't do the trick, because the large image URLs are inside a JSON snippet next to each thumbnail in the DOM.
<?php
$search_query = "Some Keyword"; // change this
$search_query = urlencode($search_query);
$googleRealURL = "https://www.google.com/search?hl=en&biw=1360&bih=652&tbs=isz%3Alt%2Cislt%3Asvga%2Citp%3Aphoto&tbm=isch&sa=1&q=".$search_query."&oq=".$search_query."&gs_l=psy-ab.12...0.0.0.10572.0.0.0.0.0.0.0.0..0.0....0...1..64.psy-ab..0.0.0.wFdNGGlUIRk";

// Call Google with cURL and a browser-like User-Agent
$ch = curl_init($googleRealURL);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux i686; rv:20.0) Gecko/20121230 Firefox/20.0');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Note: the next two options turn off TLS certificate checks, which is insecure.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$google = curl_exec($ch);
if ($google === false) {
    die("request failed: " . curl_error($ch));
}

// The large-image URL is inside a JSON snippet of the form "ou":"big url"
$array_imgurl = array();
$array_imghtml = explode("\"ou\":\"", $google);
foreach ($array_imghtml as $key => $value) {
    if ($key > 0) { // element 0 is the text before the first "ou" key
        $array_imghtml_2 = explode("\",\"", $value);
        $array_imgurl[] = $array_imghtml_2[0];
    }
}
var_dump($array_imgurl); // array contains the URLs of the large images
?>
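As a variation on the "ou" extraction in this answer, here is a minimal sketch that uses a regular expression and json_decode instead of explode, so that JSON escapes such as \u003d or \/ in the URLs are decoded. The function name extract_original_urls is hypothetical, and the sketch assumes the page still embeds the "ou":"..." snippets described here, which Google may change at any time.

```php
<?php
// Hypothetical helper (not part of any library): collect every "ou":"..." value.
function extract_original_urls(string $html): array
{
    // Capture the quoted JSON string after the "ou" key, allowing escaped characters.
    preg_match_all('/"ou":("(?:[^"\\\\]|\\\\.)*")/', $html, $matches);

    $urls = [];
    foreach ($matches[1] as $json_string) {
        // json_decode turns the quoted JSON string into a plain PHP string.
        $url = json_decode($json_string);
        if (is_string($url)) {
            $urls[] = $url;
        }
    }
    return $urls;
}
```

It could be called as extract_original_urls($google) on the string returned by curl_exec; the result is an array of URLs, or an empty array when no "ou" key is found.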
Answer 1 (score: 0)
I think that, instead of scraping the page, you should use the Google Custom Search API.
For details, see: https://developers.google.com/custom-search/json-api/v1/overview
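A rough sketch of what an image query against that API might look like follows. The helper names build_image_search_url and extract_image_links are hypothetical, and the API key and search engine ID are placeholders you would have to create yourself. The query parameters (key, cx, q, searchType, imgSize, num) and the items[].link response field are written as I understand them from the documentation linked above, so check them there before relying on this.

```php
<?php
// Hypothetical helper: build a Custom Search JSON API request URL for large images.
function build_image_search_url(string $api_key, string $engine_id, string $query, int $num = 10): string
{
    $params = [
        'key'        => $api_key,               // placeholder: your API key
        'cx'         => $engine_id,             // placeholder: your search engine ID
        'q'          => $query,
        'searchType' => 'image',                // restrict results to images
        'imgSize'    => 'large',                // assumed size filter value
        'num'        => max(1, min(10, $num)),  // keep the result count between 1 and 10
    ];
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query($params);
}

// Hypothetical helper: collect the "link" field of each item in the JSON response.
function extract_image_links(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items']) || !is_array($data['items'])) {
        return [];
    }
    $links = [];
    foreach ($data['items'] as $item) {
        if (isset($item['link'])) {
            $links[] = $item['link'];
        }
    }
    return $links;
}
```

To use it, fetch the URL from build_image_search_url('YOUR_API_KEY', 'YOUR_ENGINE_ID', 'Some Keyword') with file_get_contents or cURL and pass the response body to extract_image_links; each returned link can then be saved with file_put_contents as in the question.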