Extracting images and determining their dimensions

Time: 2014-09-12 08:34:43

Tags: php image web-scraping image-size

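I have PHP code that extracts and retrieves all the images from a website. How can I modify it so that it also displays each image's dimensions (width and height)?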

Here is the PHP code:

<?php
$page_title = "MiniCrawler";
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title><?php print($page_title) ?></title>
</head>
<body>

    <?php 
    ini_set('display_errors',1);
    error_reporting(E_ALL|E_STRICT);
// Include simple_html_dom.php.
    include_once ('simple_html_dom.php');

// Add the url of the site you want to scrape. 
    $target_url = "http://www.alibaba.com/";

// Let simple_html_dom do its magic:
    $html = new simple_html_dom();
    $html->load_file($target_url);

// Loop through the page and find every <img> element in the HTML
    foreach($html->find('img') as $link){
        echo $link->src."<br />";
        echo '<img src ="'. $link->src.'"><br />';
    }

    ?>
</body>
</html>

Thanks.

1 answer:

Answer 0 (score: 1)

First, you have to check whether the $link->src string already starts with the domain name:

<?php

  if(substr($link->src, 0, 4) == "http"){
    // url already complete
    $path = $link->src;
  }else if(substr($link->src, 0, 1) == "/"){
    // path starts absolute
    // trim the trailing slash of $target_url to avoid "//" in the result
    $path = rtrim($target_url, '/') . $link->src;
  }else{
    // path starts relative -> http://stackoverflow.com/questions/4444475/transfrom-relative-path-into-absolute-url-using-php
  }

?>
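
The empty else branch above can be filled in along the following lines. This is only a minimal sketch: `resolve_url()` is a hypothetical helper, not part of simple_html_dom, and it assumes `$base` is an absolute http(s) URL. It also handles protocol-relative sources such as `//cdn.example.com/a.png`, but it ignores edge cases like `data:` URIs or query strings that contain slashes.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// resolve an image src against the URL of the page it was found on.
function resolve_url($base, $src)
{
    // Already absolute (http:// or https://): use as-is.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }

    $parts  = parse_url($base);
    $scheme = isset($parts['scheme']) ? $parts['scheme'] : 'http';

    // Protocol-relative (//host/path): reuse the scheme of the base URL.
    if (substr($src, 0, 2) === '//') {
        return $scheme . ':' . $src;
    }

    $root = $scheme . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Root-relative (/img/a.png): append to scheme and host.
    if (substr($src, 0, 1) === '/') {
        return $root . $src;
    }

    // Document-relative (img/a.png, ../a.png): start from the base directory.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    // Collapse "." and ".." segments.
    $segments = array();
    foreach (explode('/', $dir . $src) as $segment) {
        if ($segment === '..') {
            array_pop($segments);
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }
    return $root . '/' . implode('/', $segments);
}
```

With such a helper, the whole if/else chain could be replaced by `$path = resolve_url($target_url, $link->src);`.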

Then, request the image dimensions with the getimagesize() function:

<?php

  $size = getimagesize($path); // returns false if the image cannot be read
  if ($size !== false) {
    list($width, $height, $type, $attr) = $size;
    echo '<img src ="'. $link->src.'" width="' . $width . '" height="' . $height . '"><br />';
  }

?>
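
Since getimagesize() returns false when a file is missing or is not a recognized image, it can help to wrap the call. The following is a sketch under assumptions: `image_dimensions()` is a hypothetical name, and reading remote URLs this way requires `allow_url_fopen` to be enabled in the PHP configuration.

```php
<?php
// Hypothetical wrapper (an assumption, not from the original answer):
// returns array('width' => int, 'height' => int), or null on failure.
function image_dimensions($path)
{
    // Suppress the warning raised for unreadable files; failure is
    // signalled through the false return value instead.
    $size = @getimagesize($path);
    if ($size === false) {
        return null;
    }
    // Index 0 is the width and index 1 is the height, both in pixels.
    return array('width' => $size[0], 'height' => $size[1]);
}
```

Inside the foreach loop from the question, this could be used as `$dims = image_dimensions($path);` followed by a check for `null` before printing `$dims['width']` and `$dims['height']`. Note that each call fetches the image over HTTP, so pages with many images may load slowly.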