Question

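Can anyone help me with this? How can I extract some pages from a PDF and return them as a byte array or a Stream, without writing a physical file as output?

Here is how I currently do it with a FileStream: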

public static void ExtractPages(string sourcePdfPath, string outputPdfPath, int startPage, int endPage)
{
    PdfReader reader = null;
    Document sourceDocument = null;
    PdfCopy pdfCopyProvider = null;
    PdfImportedPage importedPage = null;

    try
    {
        reader = new PdfReader(sourcePdfPath);
        sourceDocument = new Document(reader.GetPageSizeWithRotation(startPage));
        pdfCopyProvider = new PdfCopy(sourceDocument, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));
        sourceDocument.Open();

        // Copy every page in the inclusive range startPage..endPage.
        for (int i = startPage; i <= endPage; i++)
        {
            importedPage = pdfCopyProvider.GetImportedPage(reader, i);
            pdfCopyProvider.AddPage(importedPage);
        }
        sourceDocument.Close();
        reader.Close();
    }
    catch (Exception)
    {
        // Rethrow with "throw;" so the original stack trace is preserved.
        throw;
    }
}

I need something like this:

public static byte[] ExtractPages(string sourcePdfPath, int startPage, int endPage)
{
....
    return byte[];
}

Answer 1

Replace new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create) with a MemoryStream and return its contents.

Something like this (untested, but it should work):

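// Needs: using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
public static byte[] ExtractPages(string sourcePdfPath, int startPage, int endPage)
{
    PdfReader reader = null;
    Document sourceDocument = null;
    PdfCopy pdfCopyProvider = null;
    PdfImportedPage importedPage = null;
    MemoryStream target = new MemoryStream();

    reader = new PdfReader(sourcePdfPath);
    sourceDocument = new Document(reader.GetPageSizeWithRotation(startPage));
    // Write the copied pages into memory instead of a file on disk.
    pdfCopyProvider = new PdfCopy(sourceDocument, target);
    sourceDocument.Open();

    // Copy every page in the inclusive range startPage..endPage.
    for (int i = startPage; i <= endPage; i++)
    {
        importedPage = pdfCopyProvider.GetImportedPage(reader, i);
        pdfCopyProvider.AddPage(importedPage);
    }
    // Closing the document finishes writing the PDF into the stream.
    sourceDocument.Close();
    reader.Close();

    // ToArray() still works even if the MemoryStream has been closed.
    return target.ToArray();
}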
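
If you would rather hand back a Stream than a byte[], one option is to wrap the returned bytes in a new read-only MemoryStream. The sketch below uses only the .NET base class library, and its names (PdfBytesDemo, WriteAndClose, AsStream) are made up for illustration. WriteAndClose is a stand-in for ExtractPages: it assumes, as I believe is the iTextSharp default, that closing the document also closes the target stream, and shows that ToArray() still returns the data afterwards.

```csharp
using System;
using System.IO;

public static class PdfBytesDemo
{
    // Hypothetical stand-in for ExtractPages: write into a MemoryStream,
    // close it (assumed to mimic what closing the document does),
    // then read the buffer back out.
    public static byte[] WriteAndClose(byte[] payload)
    {
        MemoryStream target = new MemoryStream();
        target.Write(payload, 0, payload.Length);
        target.Close();

        // MemoryStream.ToArray() is documented to work on a closed stream.
        return target.ToArray();
    }

    // Wrap already-extracted bytes in a fresh, read-only stream.
    // The new stream starts at position 0, so callers can read immediately.
    public static Stream AsStream(byte[] pdfBytes)
    {
        return new MemoryStream(pdfBytes, false);
    }
}
```

With the answer's method this would be called as PdfBytesDemo.AsStream(ExtractPages(path, 1, 3)). The cost is that the caller gets a second stream over the same array rather than the original one, but it avoids handing out a stream that has already been closed.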
