Question

My two documents are created with HtmlConverter.convertToDocument and then merged into a single PDF:

PdfDocument pdf = new PdfDocument(new PdfWriter(pdfDest));

PdfMerger merger = new PdfMerger(pdf, false, true).setCloseSourceDocuments(true);

// Convert
ConverterProperties converterProperties = new ConverterProperties().setBaseUri(resourceFolder);
OutlineHandler outlineHandler = OutlineHandler.createStandardHandler();
converterProperties.setBaseUri(".");
converterProperties.setOutlineHandler(outlineHandler);

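The first document contains the bookmark "HTML Ipsum Presents"; the second contains "Plastic_parts_Basic" and "Amo" (with children).

Note the use of the outline handler. After merging, the bookmarks appear to get mixed up. That is plausible, given that each document's OutlineHandler creates its destinations following the same pattern: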

OutlineHandler addOutline(ITagWorker tagWorker, IElementNode element, ProcessorContext context) {
    String tagName = element.name();
    if (null != tagWorker && hasTagPriorityMapping(tagName) && context.getPdfDocument() != null) {
        int level = (int) getTagPriorityMapping(tagName);
        if (null == currentOutline) {
            currentOutline = context.getPdfDocument().getOutlines(false);
        }
        PdfOutline parent = currentOutline;
        while (!levelsInProcess.isEmpty() && level <= levelsInProcess.getFirst()) {
            parent = parent.getParent();
            levelsInProcess.pop();
        }
        String content = ((JsoupElementNode) element).text();
        if (content.isEmpty()) {
            content = getUniqueID(tagName);
        }
        PdfOutline outline = parent.addOutline(content);
        String destination = DESTINATION_PREFIX + getUniqueID(DESTINATION_PREFIX);
        outline.addDestination(PdfDestination.makeDestination(new PdfString(destination)));

        destinationsInProcess.push(destination);

        levelsInProcess.push(level);
        currentOutline = outline;
    }
    return this;
}

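Clicking "Header Level 2" in the bookmarks jumps to the second header of the last merged document ("Amo").

I tried to extend the OutlineHandler class, but the method I need to change (getUniqueID) is private in the superclass and therefore not accessible from a subclass.

Is it possible to get unique destinations across multiple documents created from HTML?

The source files (Java and HTML) and the generated PDF (see RFQMerge.pdf) are here: the source code, files and result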

The accepted answer does not work for me; I keep getting a NullPointerException on the second line of this code:

PdfDictionary names = targetPdf.getCatalog().getPdfObject().getAsDictionary(PdfName.Names);
names.put(PdfName.Dests, replaceDict);

Here are the code and the input/source files: https://www.dropbox.com/s/kg7vsb0j3hbkfca/stackoverflowClarification.zip?dl=0

Answer 1

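Your problem is the following: iText generates PDFs with identical outline destination names and does not resolve the conflict during the merge (instead, iText logs a warning and replaces the old destination with the new one).

There are two ways to handle this:

1) Create the PDFs with unique outline names. Unfortunately, the current OutlineHandler implementation is too private to be overridden properly. However, you can build a custom version of pdfHTML to suit your needs. The repository is at https://github.com/itext/i7j-pdfhtml, and the method you are interested in is OutlineHandler's reset: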
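The overwrite can be illustrated with a small model. The sketch below is not iText code: it assumes a plain java.util.Map stands in for the merged document's named-destination tree, and the names outline-1 and outline-2 as well as the page labels are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DestinationCollisionSketch {
    // Copies every named destination of a source document into the merged map.
    // putAll silently replaces entries whose key already exists, which is the
    // same kind of overwrite the answer describes for identical destination names.
    static void mergeInto(Map<String, String> merged, Map<String, String> source) {
        merged.putAll(source);
    }

    static Map<String, String> demo() {
        // Hypothetical destinations of the first converted document
        Map<String, String> first = new LinkedHashMap<>();
        first.put("outline-1", "doc1/page1");
        first.put("outline-2", "doc1/page2");

        // The second document was converted with the same naming pattern
        Map<String, String> second = new LinkedHashMap<>();
        second.put("outline-1", "doc2/page1");
        second.put("outline-2", "doc2/page3");

        Map<String, String> merged = new LinkedHashMap<>();
        mergeInto(merged, first);
        mergeInto(merged, second);
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model both surviving entries point into the second document, which matches the symptom in the question where a bookmark from the first document jumps into the last merged one.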

/**
 * Resets the current state so that this {@link OutlineHandler} is ready to process new document
 */
public void reset() {
    currentOutline = null;
    destinationsInProcess.clear();
    levelsInProcess.clear();
    uniqueIDs.clear();
}

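Just comment out its last line and build the jar.

2) If you know that a document's destinations will cause trouble, rename them. Although PdfMerger simply replaces the old destination with the new one, it logs a warning about it. You can collect the names of the destinations that were overwritten and rename them manually before merging.

To go this way, you should: a) update the destination names: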
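To see why that one line matters, here is a simplified stand-in for the handler's ID counter. It is a sketch, not the pdfHTML source: the class name UniqueIdCounterSketch, the clearIdsOnReset flag and the dest- key are hypothetical, and it assumes the same handler instance is reused (and reset) for every conversion, as would happen when one ConverterProperties object is shared.

```java
import java.util.HashMap;
import java.util.Map;

class UniqueIdCounterSketch {
    private final Map<String, Integer> uniqueIDs = new HashMap<>();
    private final boolean clearIdsOnReset;

    UniqueIdCounterSketch(boolean clearIdsOnReset) {
        this.clearIdsOnReset = clearIdsOnReset;
    }

    // Returns the key followed by a running number that starts at 1 per key.
    String getUniqueID(String key) {
        int id = uniqueIDs.getOrDefault(key, 1);
        uniqueIDs.put(key, id + 1);
        return key + id;
    }

    // Called between documents. Clearing the counters restarts numbering at 1,
    // so the next document produces the same names again.
    void reset() {
        if (clearIdsOnReset) {
            uniqueIDs.clear();
        }
    }

    public static void main(String[] args) {
        UniqueIdCounterSketch stock = new UniqueIdCounterSketch(true);
        String firstDoc = stock.getUniqueID("dest-");
        stock.reset();
        String secondDoc = stock.getUniqueID("dest-");
        System.out.println("counters cleared: " + firstDoc + " / " + secondDoc);

        UniqueIdCounterSketch patched = new UniqueIdCounterSketch(false);
        firstDoc = patched.getUniqueID("dest-");
        patched.reset();
        secondDoc = patched.getUniqueID("dest-");
        System.out.println("counters kept: " + firstDoc + " / " + secondDoc);
    }
}
```

With the counters kept across resets, the numbering continues from one document to the next, so the generated names no longer collide as long as all documents go through the same handler instance.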

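    // Copy every named destination into a new name tree under a prefixed key
    PdfNameTree destsTree = updateDestNamesDocument.getCatalog().getNameTree(PdfName.Dests);
    PdfNameTree newNameTree = new PdfNameTree(updateDestNamesDocument.getCatalog(), PdfName.Dests);
    for (Map.Entry<String, PdfObject> entry : destsTree.getNames().entrySet()) {
        newNameTree.addEntry(prefix + entry.getKey(), entry.getValue());
    }
    PdfDictionary replaceDict = newNameTree.buildTree();
    replaceDict.makeIndirect(updateDestNamesDocument);

    // Swap the /Dests name tree inside the catalog's /Names dictionary
    PdfDictionary names = updateDestNamesDocument.getCatalog().getPdfObject().getAsDictionary(PdfName.Names);
    if (names == null) {
        // The catalog may have no /Names entry yet; without this guard the next call throws a NullPointerException
        names = new PdfDictionary();
        updateDestNamesDocument.getCatalog().getPdfObject().put(PdfName.Names, names);
    }
    names.put(PdfName.Dests, replaceDict);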

b) update the outlines:

    PdfOutline rootOutline = updateDestNamesDocument.getOutlines(false);
    updateOutlines(rootOutline, prefix);

    private void updateOutlines(PdfOutline parentOutline, String prefix) {
        for (PdfOutline outline : parentOutline.getAllChildren()) {
            updateOutlines(outline, prefix);
        }
        if (parentOutline.getDestination() instanceof PdfStringDestination) {
            parentOutline.addDestination(new PdfStringDestination(prefix + ((PdfString) parentOutline.getDestination().getPdfObject()).getValue()));
        }
    }

After that you can merge the PDFs successfully.
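The two steps have to stay in sync: every renamed destination needs its outline reference renamed with the same prefix. The following is a library-free sketch of that idea, assuming a Map as the name table and a hypothetical Outline class as the bookmark tree; none of these types are iText classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PrefixRenameSketch {
    // Minimal stand-in for a bookmark node: title, optional named destination, children.
    static final class Outline {
        final String title;
        String destination; // null when the node has no named destination
        final List<Outline> children = new ArrayList<>();

        Outline(String title, String destination) {
            this.title = title;
            this.destination = destination;
        }
    }

    // Step a: build a new name table in which every key carries the prefix.
    static Map<String, String> prefixNames(Map<String, String> names, String prefix) {
        Map<String, String> renamed = new TreeMap<>();
        for (Map.Entry<String, String> entry : names.entrySet()) {
            renamed.put(prefix + entry.getKey(), entry.getValue());
        }
        return renamed;
    }

    // Step b: walk the outline tree and apply the same prefix to each reference.
    static void prefixOutlines(Outline node, String prefix) {
        for (Outline child : node.children) {
            prefixOutlines(child, prefix);
        }
        if (node.destination != null) {
            node.destination = prefix + node.destination;
        }
    }

    public static void main(String[] args) {
        Map<String, String> names = new TreeMap<>();
        names.put("outline-1", "page1");

        Outline root = new Outline("root", null);
        root.children.add(new Outline("Header Level 2", "outline-1"));

        Map<String, String> renamed = prefixNames(names, "doc1-");
        prefixOutlines(root, "doc1-");
        System.out.println(renamed.keySet() + " <- " + root.children.get(0).destination);
    }
}
```

Using a different prefix per source document (for example an index-based one) keeps the names distinct after the merge.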

Answer 2

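Thanks to Uladzimir Asipchuk for the answer; it worked for me.

I changed it slightly to fit my requirement, which is to merge two or more PDFs that may have been created from HTML, may already exist, or both.

I ran into a problem when rebuilding the name tree with these calls: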

PdfDictionary replaceDict = newNameTree.buildTree();
replaceDict.makeIndirect(updateDestNamesDocument);

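com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.

So I simply added the destinations as new entries under custom names and updated the outlines as well. It works.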

try {
    String prefix = "cus-" + (index) + "-";
    PdfNameTree destsTree = pdf.getCatalog().getNameTree(PdfName.Dests);
    PdfNameTree newNameTree = new PdfNameTree(pdf.getCatalog(), PdfName.Dests);
    for (Map.Entry<String, PdfObject> entry : destsTree.getNames().entrySet()) {
        newNameTree.addEntry(prefix + entry.getKey(), entry.getValue());
    }

    for (Map.Entry<String, PdfObject> entry : newNameTree.getNames().entrySet()) {
        destsTree.addEntry(prefix + entry.getKey(), entry.getValue());
        System.out.println(entry.getKey() +"==>>"+ entry.getValue());
    }


    PdfOutline rootOutline = pdf.getOutlines(false);
    updateOutlines(rootOutline, prefix);
} catch (Exception e) {
    e.printStackTrace();
}

Map the newly added destinations to the outlines:

public static void updateOutlines(PdfOutline parentOutline, String prefix) {
    for (PdfOutline outline : parentOutline.getAllChildren()) {
        updateOutlines(outline, prefix);
    }
    if (parentOutline.getDestination() instanceof PdfStringDestination) {
        parentOutline.addDestination(new PdfStringDestination(prefix + ((PdfString) parentOutline.getDestination().getPdfObject()).getValue()));
    }
}

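This has been working reliably for me. Do the merge after making these changes, so that all your destinations are unique and are not overwritten during the merge.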

iText 7: merging documents created by HtmlConverter.convertToDocument while preserving outlines
