How to open a URL and save all the images on the page

Time: 2015-01-05 00:52:00

Tags: php curl

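If I want to open a URL, for example http://www.google.com/information.php, and then save all the images displayed by information.php, but only the images that sit between the div tags with the id "displayimg", how should I do it?

I would be very glad if you could help. All I know is that I can use cURL, but I don't know how to go on from there once the request has been made.

Thanks!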

function getimg($url) {
    // Request headers sent along with the download
    $headers = array();
    $headers[] = 'Accept: image/gif, image/x-bitmap, image/jpeg, image/pjpeg';
    $headers[] = 'Connection: Keep-Alive';
    $headers[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8';
    $user_agent = 'php';
    $process = curl_init($url);
    curl_setopt($process, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($process, CURLOPT_HEADER, 0);
    // Use the variable defined above ($user_agent)
    curl_setopt($process, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($process, CURLOPT_TIMEOUT, 30);
    // Return the body as a string instead of printing it
    curl_setopt($process, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($process, CURLOPT_FOLLOWLOCATION, 1);
    $return = curl_exec($process);
    curl_close($process);
    // Image data on success, false on failure
    return $return;
}

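$imgurl = 'http://www.foodtest.ru/images/big_img/sausage_3.jpg';
$imagename = basename($imgurl);
// Download only if the file is not already in tmp/
// ("continue" is only valid inside a loop, so a plain condition is used here)
if (!file_exists('./tmp/' . $imagename)) {
    $image = getimg($imgurl);
    file_put_contents('./tmp/' . $imagename, $image);
}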

Edit:

I am now using this code, but how can I store the links in an array so that I can download the images to my server?

require_once('simplehtmldom/simple_html_dom.php');
require_once('url_to_absolute.php');

$url = 'http://www.electrictoolbox.com/php-get-meta-tags-html-file/';

$html = file_get_html($url);
foreach($html->find('img') as $element) {
    echo url_to_absolute($url, $element->src), "\n";
}

2 answers:

Answer 0 (score: 0)

Try the "Simple HTML DOM Parser" library (http://simplehtmldom.sourceforge.net/).

Your code could look something like this:

<?php
include('simple_html_dom.php');
$URL = "http://www.google.com/information.php";
$dumpDir = "dumpDir/";

//Get the page as a whole    
$html = file_get_html($URL);

//Find all the images located within div
foreach($html->find("div#displayimage img") as $img){
   $src = $img->src;

   //Get filename
   $filename = substr($img->src, strrpos($img->src, "/")+1);

   //Quick fix for relative file paths: resolve against the page's directory
   //(covers simple relative paths only, not "../" or protocol-relative URLs)
   if (!preg_match('#^https?:#i', $src)) $src = rtrim(dirname($URL), '/').'/'.ltrim($src, '/');

   // Save the file
   file_put_contents($dumpDir.$filename, file_get_contents($src));
}
?>
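
To address the follow-up in the question's edit (keeping the links in an array before downloading them), here is a minimal sketch. It is not the code from this answer: it assumes PHP's built-in DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM, and the helper names collectImageUrls and resolveUrl are hypothetical. The URL resolution is deliberately simplified (no port, no "../" handling), so treat it as an illustration rather than a complete implementation.

```php
<?php
// Hypothetical helper: return the absolute URLs of all <img> tags found
// inside the <div> with the given id.
function collectImageUrls(string $html, string $divId, string $baseUrl): array
{
    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings silently
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $urls = [];
    // $divId is assumed to be a plain id without quote characters
    foreach ($xpath->query("//div[@id='$divId']//img[@src]") as $img) {
        $urls[] = resolveUrl($baseUrl, $img->getAttribute('src'));
    }
    return $urls;
}

// Hypothetical helper: simplified resolution of $src against $baseUrl.
function resolveUrl(string $baseUrl, string $src): string
{
    if (preg_match('#^https?://#i', $src)) {
        return $src;                              // already absolute
    }
    $parts  = parse_url($baseUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'];
    if (strpos($src, '//') === 0) {
        return $parts['scheme'] . ':' . $src;     // protocol-relative
    }
    if (strpos($src, '/') === 0) {
        return $origin . $src;                    // root-relative
    }
    // Relative to the directory that contains the page
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

// Usage with an inline HTML sample, so no network access is needed
$html = '<div id="displayimg"><img src="images/a.jpg"><img src="/b.png"></div>'
      . '<div id="other"><img src="c.gif"></div>';
$images = collectImageUrls($html, 'displayimg', 'http://www.google.com/information.php');
print_r($images);
```

Each entry of $images could then be handed to a downloader such as the getimg() function from the question and written to disk with file_put_contents().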

Answer 1 (score: -1)

If you want to fetch a page together with everything it needs (images, JS, CSS and so on), I suggest you use wget:

$your_url = "http://www.google.com/information.php";
$your_output_dir = "/whatever/dir/you/might/use/";
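$your_logs = "/your/log/dir/wget.log";
// Escape every argument so that special characters cannot break the shell command
$cmd = "wget -p --convert-links " . escapeshellarg($your_url)
     . " -P " . escapeshellarg($your_output_dir)
     . " -o " . escapeshellarg($your_logs);
exec($cmd);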

See the wget man page for help, or search Google for "wget examples".