PDFBox scrambling text

Time: 2013-10-31 09:28:08

Tags: java pdf pdfbox

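I have been trying to edit a PDF document to pre-fill form entries. I have it working (sort of). The text I am adding looks fine. However, other text that was already there appears to have been replaced with "&%£!£! symbols. I have worked out that it is related to the "contentStream" part of the code below. It seems to be the "setFont" line. If I remove it, the page stays normal... except that the "Hello Richard" text no longer shows up!

Please help!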

package pdfboxtest;

import java.awt.Color;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class PDFFormFiller {

    private static final String R40_NEW_FORM_PATH = "c:\\temp\\hmrc-r40.pdf";
    private static final String R40_COMPLETED_FORM_PATH = "c:\\temp\\hmrc-r40-complete.pdf";

    public static void main(String[] args) throws Exception {
        PDDocument doc = PDDocument.load(R40_NEW_FORM_PATH);

        addTextToPage(doc);

        doc.save(R40_COMPLETED_FORM_PATH);
        doc.close();
    }

    private static void addTextToPage(PDDocument doc) throws Exception {
        List pages = doc.getDocumentCatalog().getAllPages();
        PDPage firstPage = (PDPage) pages.get(0);
        PDPageContentStream contentStream = new PDPageContentStream(doc, firstPage, true, true);

        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 24);
        contentStream.beginText();
        contentStream.setNonStrokingColor(Color.BLACK);
        contentStream.moveTextPositionByAmount(100, 200);
        contentStream.drawString("HELLO RICHARD!!");
        contentStream.endText();
        contentStream.close();

    }
}

[Screenshot: This is the top of the form before I add text elsewhere]

[Screenshot: And after I've added text elsewhere, this bit of text goes nuts! I did not edit this bit though]

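1 answer:

Answer 0 (score: 1)

As already assumed in the comments, this is due to a PDFBox issue; I described a workaround in this answer. The issue is still present in PDFBox version 1.8.2 but has meanwhile been fixed for versions 1.8.3 and 2.0.0, see PDFBOX-1753.

In your case, the workaround changes the addTextToPage method as follows: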

// Additionally requires: import java.io.IOException;
private static void addTextToPage(PDDocument doc) throws IOException {
    List pages = doc.getDocumentCatalog().getAllPages();
    PDPage firstPage = (PDPage) pages.get(0);
    PDPageContentStream contentStream = new PDPageContentStream(doc, firstPage, true, true);

    firstPage.getResources().getFonts(); // <<<<<< added line: forces the initialization that setFont relies on

    contentStream.setFont(PDType1Font.HELVETICA_BOLD, 24);
    contentStream.beginText();
    contentStream.setNonStrokingColor(Color.BLACK);
    contentStream.moveTextPositionByAmount(100, 200);
    contentStream.drawString("HELLO RICHARD!!");
    contentStream.endText();
    contentStream.close();
}

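The added line forces an initialization that new PDPageContentStream forgot but that setFont relies on. You can find details in the answer referenced above. You may want to inform PDFBox development.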
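
To illustrate why one extra getter call can make a difference, here is a minimal sketch of a lazily initialized font map. It is a simplified model, not PDFBox code: the class LazyFontMapDemo, its PageResources class, and the "F0"/"F1" naming scheme are all hypothetical, and the name-collision mechanism is an assumption about how such a bug could garble existing text. The details of the real defect are in the referenced answer and in PDFBOX-1753.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of lazily loaded page font resources.
// NOT PDFBox code: every class and method name here is hypothetical.
public class LazyFontMapDemo {

    static class PageResources {
        // Fonts already stored in the file, keyed by resource name.
        private final Map<String, String> stored;
        // In-memory view; stays null until getFonts() is first called.
        private Map<String, String> fonts;

        PageResources(Map<String, String> stored) {
            this.stored = stored;
        }

        // Loads the stored fonts into memory on first access.
        Map<String, String> getFonts() {
            if (fonts == null) {
                fonts = new LinkedHashMap<>(stored);
            }
            return fonts;
        }

        // Assumed bug: chooses a "free" name from the in-memory map
        // without making sure the stored fonts were loaded first.
        String addFont(String font) {
            if (fonts == null) {
                fonts = new LinkedHashMap<>();
            }
            int i = 0;
            while (fonts.containsKey("F" + i)) {
                i++;
            }
            String name = "F" + i;
            fonts.put(name, font);
            return name;
        }
    }

    // Returns the resource name assigned to a newly added font on a page
    // that already has a font stored under the name "F0".
    static String nameForNewFont(boolean callGetFontsFirst) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("F0", "FormFont");
        PageResources resources = new PageResources(stored);
        if (callGetFontsFirst) {
            resources.getFonts(); // the workaround: force initialization
        }
        return resources.addFont("Helvetica-Bold");
    }

    public static void main(String[] args) {
        System.out.println("without workaround: " + nameForNewFont(false));
        System.out.println("with workaround:    " + nameForNewFont(true));
    }
}
```

In this model, skipping the getter lets the new font take the name "F0", which the page's existing text already uses, so that text would be drawn with the wrong font. Calling the getter first loads the existing entry, and the new font gets "F1" instead.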