Question

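I wrote some code to import content from my Blogger blog. Once I have downloaded all of the HTML content, I go through the image tags and download the corresponding images. In a large number of cases, System.Drawing.Bitmap.FromStream throws an ArgumentException. The URL I am downloading from looks fine, and it serves up an image as expected (here is the URL for one of the problem images: http://4.bp.blogspot.com/_tSWCyhtOc38/SgIPcctWRZI/AAAAAAAAAGg/2LLnVPxsogI/s1600-h/IMG_3590.jpg).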

    private static System.Drawing.Image DownloadImage(string source)
    {
        System.Drawing.Image image = null;

        // used to fetch content
        var client = new HttpClient();

        // used to store image data
        var memoryStream = new MemoryStream();

        try
        {
            // fetch the image
            var imageStream = client.GetStreamAsync(source).Result;

            // instantiate a system.drawing.image from the data
            image = System.Drawing.Bitmap.FromStream(imageStream, false, false);

            // save the image data to a memory stream
            image.Save(memoryStream, image.RawFormat);
        }
        catch (IOException exception)
        {
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (ArgumentException exception)
        {
            // sometimes, an image will link to a web page, resulting in this exception
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (AggregateException exception)
        {
            // sometimes, an image src will throw a 404
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        finally
        {
            // clean up our disposable resources
            client.Dispose();
            memoryStream.Dispose();
        }

        return image;
    }

Any idea why the ArgumentException is being thrown?

Edit: It occurred to me that it might be a proxy issue, so I added the following to my web.config:

    <system.net>
      <defaultProxy enabled="true" useDefaultCredentials="true">
        <proxy usesystemdefault="True" />
      </defaultProxy>
    </system.net>

However, adding that section made no difference.

Edit: This code is called from the context of an EF database initializer. Here is a stack trace:

    Web.dll!Web.Models.Initializer.DownloadImage(string source) Line 234 C#
    Web.dll!Web.Models.Initializer.DownloadImagesForPost.AnonymousMethod__5(HtmlAgilityPack.HtmlNode tag) Line 126 + 0x8 bytes C#
    [External Code]
    Web.dll!Web.Models.Initializer.DownloadImagesForPost(Web.Models.Post post) Line 119 + 0x34 bytes C#
    Web.dll!Web.Models.Initializer.Seed(Web.Models.FarmersMarketContext context) Line 320 + 0xb bytes C#
    [External Code]
    App_Web_l2h4tcej.dll!ASP._Page_Views_Home_Index_cshtml.Execute() Line 28 + 0x15 bytes C#
    [External Code]

Answer 1

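OK, I found the problem. It turns out that in some cases Blogger references an HTML page that renders the image, rather than the image itself, so the response in those cases is not a valid image. I added code to check the response headers before attempting to save the image data, and that fixed the problem. For the benefit of anyone else who runs into this, here is the updated code: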

    private static System.Drawing.Image DownloadImage(string source)
    {
        System.Drawing.Image image = null;

        // used to fetch content
        var client = new HttpClient();

        // used to store image data
        var memoryStream = new MemoryStream();

        try
        {
            // Blogger tacks on a -h to an image Url to link to an HTML page instead
            if (source.Contains("-h/"))
                source = source.Replace("-h/", "/");

            // fetch the image
            var response = client.GetAsync(source).Result;
            response.EnsureSuccessStatusCode();

            var contentType = response.Content.Headers.ContentType.MediaType;

            if (!contentType.StartsWith("image/"))
            {
                Debug.WriteLine(contentType);
                throw new ArgumentException("Specified source did not return an image");
            }

            var imageStream = response.Content.ReadAsStreamAsync().Result;

            // instantiate a system.drawing.image from the data
            image = System.Drawing.Bitmap.FromStream(imageStream, true, true);

            // save the image data to a memory stream
            image.Save(memoryStream, image.RawFormat);
        }
        catch (HttpRequestException exception)
        {
            // sometimes, we'll get a 404 or other unexpected response
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (IOException exception)
        {
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (ArgumentException exception)
        {
            // sometimes, an image will link to a web page, resulting in this exception
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        finally
        {
            // clean up our disposable resources
            client.Dispose();
            memoryStream.Dispose();
        }

        return image;
    }
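
One caveat with the header check in the code above: `response.Content.Headers.ContentType` is null when the server sends no Content-Type header, so reading `.MediaType` directly can throw a NullReferenceException. A minimal null-safe sketch follows; the `ContentTypeCheck` class and `IsImage` method are hypothetical names introduced for illustration, not part of the original code.

```csharp
using System;
using System.Net.Http.Headers;

public static class ContentTypeCheck
{
    // Hypothetical helper: returns true only when a Content-Type header
    // is present and its media type starts with "image/".
    public static bool IsImage(MediaTypeHeaderValue contentType)
    {
        return contentType != null
            && contentType.MediaType != null
            && contentType.MediaType.StartsWith("image/", StringComparison.OrdinalIgnoreCase);
    }
}
```

In DownloadImage this could replace the `StartsWith` test, for example `if (!ContentTypeCheck.IsImage(response.Content.Headers.ContentType))`.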

Answer 2

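You are dealing with a different problem, one I think you fixed by accident. Unfortunately, GDI+ exceptions are not very good; they often don't tell you what the real problem is.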

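An obscure tidbit in the Image.FromStream() implementation is that GDI+ uses the stream's Seek() method when it loads a bitmap from a stream. That only works when the stream allows seeking, so its CanSeek property must return true. This is generally not the case for network streams, which do not provide enough buffering to allow arbitrary seeks.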

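That is a problem with HttpClient.GetStreamAsync(), whose MSDN Library documentation says:

This method does not buffer the stream

The working version you wrote, on the other hand, uses HttpContent.ReadAsStreamAsync(), whose MSDN Library documentation says:

The returned Task object will complete after all of the content has been written as a byte array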

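So your first version did not work because the stream's CanSeek property was false, and the second version works because the entire response is read into a byte array, which allows seeking. The universal solution is to copy the stream into a MemoryStream first.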
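
The MemoryStream approach can be sketched as follows. This is a minimal illustration, not the poster's code: the `StreamBuffering` class and `ToSeekable` method are hypothetical names, and the helper simply copies whatever readable stream it is given into memory and rewinds it.

```csharp
using System;
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper: copies a readable stream into memory and rewinds it.
    // The returned MemoryStream reports CanSeek == true, which a forward-only
    // network stream typically does not.
    public static MemoryStream ToSeekable(Stream source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        var buffer = new MemoryStream();
        source.CopyTo(buffer);   // reads from the current position to the end
        buffer.Position = 0;     // rewind so a decoder starts at the first byte
        return buffer;
    }
}
```

In DownloadImage this would sit between the download and the decode, for example `Image.FromStream(StreamBuffering.ToSeekable(imageStream))`. The Image.FromStream documentation states that the stream must stay open for the lifetime of the Image, so the buffered stream should not be disposed while the image is still in use.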

ArgumentException when instantiating a Bitmap from a stream
