Question

我正在尝试执行PDFSharp库的sample program。我已经在项目中提供了参考。

问题：该代码执行没有任何错误，我发现方法ExportJpegImage()被调用了7次（即pdf中的图像数）。 但是当我尝试打开程序编写的图像（在.jpeg中时，Windows无法打开它们。

static void Main(string[] args)
    {
        Console.WriteLine("Starting PDF Sharp sample program...");

        const string filename = @"D:/Test/test.pdf";

        PdfDocument document = PdfReader.Open(filename);

        int imageCount = 0;
        // Iterate pages
        foreach (PdfPage page in document.Pages)
        {
            // Get resources dictionary
            PdfDictionary resources = page.Elements.GetDictionary("/Resources");
            if (resources != null)
            {
                // Get external objects dictionary
                PdfDictionary xObjects = resources.Elements.GetDictionary("/XObject");
                if (xObjects != null)
                {
                    ICollection<pdfitem> items = xObjects.Elements.Values;
                    // Iterate references to external objects
                    foreach (PdfItem item in items)
                    {
                        PdfReference reference = item as PdfReference;
                        if (reference != null)
                        {
                            PdfDictionary xObject = reference.Value as PdfDictionary;
                            // Is external object an image?
                            if (xObject != null && xObject.Elements.GetString("/Subtype") == "/Image")
                            {
                                ExportJpegImage(xObject, ref imageCount);
                            }
                        }
                    }
                }
            }
        }

    }

 static void ExportJpegImage(PdfDictionary image, ref int count)
    {
        // Fortunately JPEG has native support in PDF and exporting an image is just writing the stream to a file.
        byte[] stream = image.Stream.Value;
        FileStream fs = new FileStream(String.Format(@"D:\Test\Image{0}.jpeg", count++), FileMode.Create, FileAccess.Write);
        BinaryWriter bw = new BinaryWriter(fs);
        bw.Write(stream);
        bw.Close();
    }

Answer 1

请阅读example之前的注释：

注意：该摘要显示了 如何从PDF文件导出JPEG图像 。 PDFsharp无法将PDF页面转换为JPEG文件。此示例 不处理非JPEG图像 。（尚未）不能处理经过平面编码的JPEG图像。

PDF中的非JPEG图像有几种不同的格式。 这个简单的示例不支持这些代码，并且需要几个小时的编码，但这是读者的练习。

因此，您要从PDF提取的图像很可能根本没有作为JPEG图像嵌入（至少在没有进一步过滤（如放气）的情况下）。要将数据转换为普通图像查看器可以处理的数据，您很可能必须花费几个小时的编码。

PdfSharp C＃库提取的图像破碎

1 个答案: