Question

I need help adding Cyrillic values to a field using the PDFBox API. Here is what I have so far:

PDDocument document = PDDocument.load(file);
PDDocumentCatalog dc = document.getDocumentCatalog();
PDAcroForm acroForm = dc.getAcroForm();
PDField naziv = acroForm.getField("naziv");
naziv.setValue("Наслов"); // this part right here
naziv.setValue("Naslov"); // it works like this

With Latin input it works perfectly, but I also need to handle Cyrillic input. How can I do that?

P.S. This is the exception I get: Caused by: java.lang.IllegalArgumentException: U+043D ('afii10079') is not available in this font Helvetica encoding: WinAnsiEncoding

Answer 1

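The code below adds a suitable font to the AcroForm default resources dictionary and replaces the font name in each text field's default appearance string. When you then call setValue(), PDFBox recreates the field's appearance stream using the new font.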

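import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;

public static void main(String[] args) throws IOException
{
    PDDocument doc = PDDocument.load(new File("ZPe.pdf"));
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    PDResources dr = acroForm.getDefaultResources();

    // Important: the font is Type0 (allows more than 256 glyphs) and NOT SUBSETTED
    PDFont font = PDType0Font.load(doc, new FileInputStream("c:/windows/fonts/arial.ttf"), false);

    // register the font in the default resources; the returned name is used in the DA strings
    COSName fontName = dr.add(font);
    Iterator<PDField> it = acroForm.getFieldIterator();
    while (it.hasNext())
    {
        PDField field = it.next();
        if (field instanceof PDTextField)
        {
            PDTextField textField = (PDTextField) field;
            String da = textField.getDefaultAppearance();

            // replace font name in default appearance string
            Pattern pattern = Pattern.compile("\\/(\\w+)\\s.*");
            Matcher matcher = pattern.matcher(da);
            if (!matcher.find())
            {
                // no font name found in the default appearance, leave this field as it is
                continue;
            }
            String oldFontName = matcher.group(1);
            da = da.replaceFirst(oldFontName, fontName.getName());

            textField.setDefaultAppearance(da);
        }
    }
    // setValue() rebuilds the appearance stream with the new font
    acroForm.getField("name1").setValue("Наслов");
    doc.save("result.pdf");
    doc.close();
}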
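
The font-name replacement in the default appearance string can also be pulled out into a small helper that only touches the name operand directly in front of the Tf operator. This is just a sketch, not part of PDFBox: the class name DefaultAppearanceFonts and the method replaceFontName are made up for illustration, and it assumes the default appearance has the usual "/Name size Tf" form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a PDFBox class): swaps the font name in a
// default appearance (DA) string such as "/Helv 12 Tf 0 g".
final class DefaultAppearanceFonts {

    // Group 1: the font name after the slash.
    // Group 2: the font size and the Tf operator that follow it.
    private static final Pattern FONT_OPERATOR =
            Pattern.compile("/([^\\s/]+)(\\s+[-+]?[0-9]*\\.?[0-9]+\\s+Tf)");

    private DefaultAppearanceFonts() {
    }

    static String replaceFontName(String da, String newFontName) {
        Matcher matcher = FONT_OPERATOR.matcher(da);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No font operator in default appearance: " + da);
        }
        // quoteReplacement keeps '$' or '\' in the new name from being read as regex syntax
        return matcher.replaceFirst("/" + Matcher.quoteReplacement(newFontName) + "$2");
    }
}
```

In the loop above this would be used as textField.setDefaultAppearance(DefaultAppearanceFonts.replaceFontName(da, fontName.getName())). Throwing on a missing font operator is a design choice for the sketch; skipping the field instead may suit some forms better.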

Update 4.4.2019: to save some space, it may be useful to remove the appearance before calling setValue:

acroForm.getField("name1").getWidgets().get(0).setAppearance(null);

To check whether the AcroForm default resources contain unused fonts, see this answer.

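Update 7.4.2019: if the font is very large (e.g. ArialUni) and many fields are being set, you may run into poor performance (PDFBOX-4508). In that case, save and reload the file before calling setValue.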

To find out whether a font supports the intended text, call PDFont.encode() and check for an IllegalArgumentException.
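
A minimal sketch of that check follows. The class FontSupportCheck and the interface TextEncoder are hypothetical names introduced here so the snippet compiles without PDFBox on the classpath; the interface simply mirrors the shape of PDFont.encode(String), which returns a byte array and declares IOException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper (not a PDFBox class): probes whether an encoder
// accepts a given text by watching for IllegalArgumentException.
final class FontSupportCheck {

    // Stand-in for a font's encode method, e.g. PDFont#encode(String).
    interface TextEncoder {
        byte[] encode(String text) throws IOException;
    }

    private FontSupportCheck() {
    }

    static boolean canEncode(TextEncoder encoder, String text) {
        try {
            encoder.encode(text);
            return true;
        } catch (IllegalArgumentException e) {
            // the encoder reported a character it has no glyph for
            return false;
        } catch (IOException e) {
            // an I/O failure is a different problem, so it is not swallowed
            throw new UncheckedIOException(e);
        }
    }
}
```

With a loaded PDFont named font, the call would look like FontSupportCheck.canEncode(font::encode, "Наслов"). Running it once per value before setValue() lets you fall back to another font or report the problem instead of failing midway through filling the form.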

PDFBox API: How to handle Cyrillic values in AcroForm fields
