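Is it possible to detect the dimensions of an image at a URL without downloading the entire image?

Time: 2011-02-13 11:13:19

Tags: html image parsing html-parsing image-extraction

Given an HTML page containing a news article, I am trying to detect the relevant images from the article. To do that, I am looking at the size of the images (if they are too small, they are probably navigation elements), but I don't want to download every image.

Is there a way to get the width and height of an image without downloading the full image?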

2 answers:

Answer 0: (score: 2)

I don't know whether it will help you speed up your application, but it can be done. Check out these two articles:

http://www.anttikupila.com/flash/getting-jpg-dimensions-with-as3-without-loading-the-entire-file/ for JPEG

http://www.herrodius.com/blog/265 for PNG

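Both are written for ActionScript, but the principle of course applies to other languages as well.

I made a sample in C#. It is not the prettiest code and it only works for JPEG, but it could easily be extended to PNG too: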

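using System;
using System.IO;
using System.Net;

// Open the image URL as a stream; bytes are only pulled as they are read below
var request = (HttpWebRequest) WebRequest.Create("http://unawe.org/joomla/images/materials/posters/galaxy/galaxy_poster2_very_large.jpg");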
using (WebResponse response = request.GetResponse())
using (Stream responseStream = response.GetResponseStream())
{
    int r;
    bool found = false;
    while (!found && (r = responseStream.ReadByte()) != -1)
    {
        if (r != 255) continue;

        int marker = responseStream.ReadByte();

        // APPn segments (markers 0xE0-0xEF): read the 2-byte length and skip the payload
        if (marker >= 224 && marker <= 239)
        {
            int payloadLengthHi = responseStream.ReadByte();
            int payloadLengthLo = responseStream.ReadByte();
            int payloadLength = (payloadLengthHi << 8) + payloadLengthLo;
            for (int i = 0; i < payloadLength - 2; i++)
                responseStream.ReadByte();
        }
        // SOF0 (baseline frame header): holds the image dimensions
        else if (marker == 192)
        {
            // Length of payload - don't care
            responseStream.ReadByte();
            responseStream.ReadByte();

            // Bit depth - don't care
            responseStream.ReadByte();

            // In SOF0 the height (number of lines) is stored before the width
            int heightHi = responseStream.ReadByte();
            int heightLo = responseStream.ReadByte();
            int height = (heightHi << 8) + heightLo;

            int widthHi = responseStream.ReadByte();
            int widthLo = responseStream.ReadByte();
            int width = (widthHi << 8) + widthLo;

            Console.WriteLine(width + "x" + height);
            found = true;
        }
    }
}
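
The sample above handles only JPEG. For PNG the job is simpler, because the dimensions sit at a fixed offset: an 8-byte signature, then the IHDR chunk (4-byte length, 4-byte type), then width and height as big-endian 32-bit integers. Below is a minimal Python sketch of that idea; the function name `png_size` is made up for illustration, and it assumes the caller has already fetched the first 24 bytes of the file by whatever means.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes):
    """Return (width, height) from the first 24 bytes of a PNG, or None.

    `png_size` is a hypothetical helper name, not part of any library.
    """
    if len(header) < 24 or header[:8] != PNG_SIGNATURE:
        return None
    # Bytes 8-11 hold the IHDR chunk length, bytes 12-15 the chunk type.
    if header[12:16] != b"IHDR":
        return None
    # Width and height follow as two big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

Only 24 bytes are needed per PNG, so reading that much from the response stream and then closing the connection should be enough.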

Edit: I'm no Python expert, but this article seems to describe a Python lib that does just this (last example): http://effbot.org/zone/pil-image-size.htm
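
The code from that article is not reproduced here. As a rough stand-in, this is a standard-library-only Python sketch of the same marker-walking idea as the C# sample. It differs in two ways that are my own choices rather than anything from the linked articles: it skips every segment by its length instead of only the APPn ones, and it accepts the other SOF variants (for example progressive JPEG) in addition to SOF0. The name `jpeg_size` is hypothetical, and the stream is assumed to return the requested number of bytes from `read(n)` unless it hits end of data.

```python
import io
import struct

# SOF markers that carry frame dimensions (0xC4, 0xC8 and 0xCC are not SOF).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_size(stream):
    """Read up to the first SOF segment and return (width, height), or None."""
    if stream.read(2) != b"\xff\xd8":  # every JPEG starts with SOI
        return None
    while True:
        byte = stream.read(1)
        if not byte:
            return None  # ran out of data before finding a frame header
        if byte != b"\xff":
            continue
        marker = stream.read(1)
        while marker == b"\xff":  # 0xFF fill bytes may precede a marker
            marker = stream.read(1)
        if not marker:
            return None
        code = marker[0]
        if code in (0x00, 0x01) or 0xD0 <= code <= 0xD8:
            continue  # stuffed byte or standalone marker: no length field
        if code in (0xD9, 0xDA):
            return None  # EOI or start of scan reached without a frame header
        length_bytes = stream.read(2)
        if len(length_bytes) < 2:
            return None
        (length,) = struct.unpack(">H", length_bytes)
        if code in SOF_MARKERS:
            data = stream.read(5)
            if len(data) < 5:
                return None
            # Sample precision, then height (lines), then width.
            _precision, height, width = struct.unpack(">BHH", data)
            return width, height
        # Any other segment: skip its payload (length includes its own 2 bytes).
        if len(stream.read(length - 2)) < length - 2:
            return None
```

Any object with a `read(n)` method should work, so the response returned by `urllib.request.urlopen(url)` could be passed in directly and closed once the function returns; how many bytes the network layer has already buffered by then is outside the sketch's control.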

Answer 1: (score: 1)

No, this is not possible. You can, however, get the information from the img tags, though not for background images.
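
Reading the sizes declared on the img tags could look roughly like the Python sketch below, which uses only the standard library's `html.parser`. The class and function names are invented for illustration. It assumes the page actually sets numeric `width` and `height` attributes; many pages omit them, use percentages, or declare values that differ from the real file, and in those cases the sketch reports `None` or the declared value, not the true size.

```python
from html.parser import HTMLParser


def _to_int(value):
    """Convert an attribute value to int, or None if missing or non-numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


class ImgSizeParser(HTMLParser):
    """Collects (src, width, height) for every img tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        self.images.append((
            attributes.get("src"),
            _to_int(attributes.get("width")),
            _to_int(attributes.get("height")),
        ))


def declared_img_sizes(html):
    """Return the sizes declared in the markup, without fetching any image."""
    parser = ImgSizeParser()
    parser.feed(html)
    parser.close()
    return parser.images
```

Images whose declared size is missing could then be the only ones that need a partial download.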