I am trying to merge two large PDF files without loading them completely into memory.
I tried PdfMerger, and I also tried without PdfMerger, using code like this:
    using (var writer = new PdfWriter(new FileStream(@"C:\Test\OutBig.pdf", FileMode.OpenOrCreate)))
    using (var outputDocument = new PdfDocument(writer)) {
        using (var inputDoc = new PdfDocument(new PdfReader(@"C:\Test\InBig.pdf"))) {
            for (int i = 1; i <= inputDoc.GetNumberOfPages(); i++) {
                // Draw each source page onto a new output page as a form XObject
                var newp = outputDocument.AddNewPage();
                var canvas = new PdfCanvas(newp);
                var origPage = inputDoc.GetPage(i);
                var copy = origPage.CopyAsFormXObject(outputDocument);
                canvas.AddXObject(copy, 0, 0);
                // Attempt to release everything belonging to this page
                copy.Flush();
                origPage = null;
                canvas.Release();
                newp.Flush();
                writer.Flush();
                canvas = null;
                newp = null;
            }
        }
    }
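The code works, but every page is loaded into memory and stays loaded, so I end up with more than 1 GB held in memory.
Do you know how to merge two PDF files with iText 7 without loading them into memory?
Best regards,
Patrice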
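Answer 0 (score: 1)
There are several ways to reduce memory consumption when copying large documents. One of them exploits the fact that objects are read on demand: you can copy the pages from the source document to the target document in batches, opening and closing the source document several times, so that the objects read for earlier batches can be released.
The following code is Java; it can be converted to C# almost entirely by capitalizing the method names.
    // Open the source once just to count its pages
    PdfDocument doc1 = new PdfDocument(new PdfReader(IN1));
    int numOfPages = doc1.getNumberOfPages();
    doc1.close();
    PdfDocument outDoc = new PdfDocument(new PdfWriter(OUT));
    int numOfPagesPerDocumentOpen = 10;
    for (int i = 1; i <= numOfPages; ) {
        int firstPageToCopy = i;
        int lastPageToCopy = Math.min(i + numOfPagesPerDocumentOpen - 1, numOfPages);
        // Reopen the source for each batch of pages
        doc1 = new PdfDocument(new PdfReader(IN1));
        doc1.copyPagesTo(firstPageToCopy, lastPageToCopy, outDoc);
        // Flush the last (lastPageToCopy - firstPageToCopy + 1) pages of the output
        for (int j = 0; j <= lastPageToCopy - firstPageToCopy; j++) {
            outDoc.getPage(outDoc.getNumberOfPages() - j).flush(true);
        }
        doc1.close();
        i = lastPageToCopy + 1;
    }
    outDoc.close();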
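To make the batching in this answer concrete, here is a minimal Java sketch, with no iText dependency, that only computes which page ranges would be copied in each open/close cycle. The class name `PageBatches` and the method `ranges` are hypothetical names introduced for illustration; the arithmetic is meant to mirror `firstPageToCopy` and `lastPageToCopy` in the loop from the answer.

```java
import java.util.ArrayList;
import java.util.List;

class PageBatches {
    // Returns 1-based, inclusive {first, last} page ranges,
    // one range per open/close cycle of the source document.
    static List<int[]> ranges(int numOfPages, int pagesPerOpen) {
        if (pagesPerOpen < 1) {
            throw new IllegalArgumentException("pagesPerOpen must be at least 1");
        }
        List<int[]> result = new ArrayList<>();
        for (int first = 1; first <= numOfPages; ) {
            int last = Math.min(first + pagesPerOpen - 1, numOfPages);
            result.add(new int[] {first, last});
            first = last + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one range per line for a 25-page source, 10 pages per cycle
        for (int[] r : ranges(25, 10)) {
            System.out.println(r[0] + "-" + r[1]);
        }
    }
}
```

For a 25-page source and 10 pages per cycle this gives the ranges 1-10, 11-20 and 21-25, so the source would be opened three times for copying. A larger batch size presumably means fewer reopenings but more source objects held at once, so the value of 10 is a tuning choice rather than a requirement.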
Answer 1 (score: 0)
I have now tried a few components (Aspose, iTextSharp, and Telerik), and Telerik seems to have cracked it.
I followed these steps, and memory usage stayed low.
Sample code
    var files = Directory.GetFiles(bundlePath);
    using (PdfStreamWriter fileWriter = new PdfStreamWriter(File.OpenWrite(outputFile)))
    {
        // Iterate through the files you would like to merge
        foreach (string documentName in files)
        {
            // Open each of the files
            using (PdfFileSource fileToMerge = new PdfFileSource(File.OpenRead(documentName)))
            {
                // Iterate through the pages of the current document
                foreach (PdfPageSource pageToMerge in fileToMerge.Pages)
                {
                    // Append the current page to the fileWriter, which holds the result FileStream
                    fileWriter.WritePage(pageToMerge);
                }
            }
        }
    }