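I need to export a docx document to PDF/A-1b on an Ubuntu server, using the Apache FOP backend.
The document is nothing special; it uses the basic Windows fonts Calibri, Courier New, Times New Roman, Symbol and Wingdings.
The PDF/A-1b profile requires every font to be embedded, including the standard base-14 fonts, so I extracted the Ubuntu Type1 fonts from /usr/share/fonts/type1/urw-base35 and now have 14 .pfb and 14 .afm files in /home/luca/Desktop/ubuntufonts/.
I think I set everything up correctly, but enabling the A-1b profile results in the following exception: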
Caused by: java.io.FileNotFoundException: Neither an AFM nor a PFM file was found for NimbusRoman-BoldItalic.pfb
at org.apache.fop.fonts.type1.Type1FontLoader.read(Type1FontLoader.java:147)
at org.apache.fop.fonts.FontLoader.getFont(FontLoader.java:126)
at org.apache.fop.fonts.FontLoader.loadFont(FontLoader.java:110)
at org.apache.fop.fonts.LazyFont.load(LazyFont.java:119)
...
Caused by: java.lang.RuntimeException: Failed to read font file NimbusRoman-BoldItalic.pfb
at org.apache.fop.fonts.LazyFont.load(LazyFont.java:132)
at org.apache.fop.fonts.LazyFont.hasChar(LazyFont.java:179)
at org.apache.fop.fonts.Font.hasChar(Font.java:278)
at org.apache.fop.fonts.FontSelector.selectFontForCharacter(FontSelector.java:47)
at org.apache.fop.fonts.FontSelector.selectFontForCharacterInText(FontSelector.java:85)
at org.apache.fop.layoutmgr.inline.TextLayoutManager.initialize(TextLayoutManager.java:162)
at org.apache.fop.layoutmgr.AbstractLayoutManager.getChildLM(AbstractLayoutManager.java:118)
But the files are there:
luca@luca-vm:~/Desktop/ubuntufonts$ ls
D050000L.afm NimbusRoman-Italic.afm
D050000L.pfb NimbusRoman-Italic.pfb
NimbusMonoPS-Bold.afm NimbusRoman-Regular.afm
NimbusMonoPS-BoldItalic.afm NimbusRoman-Regular.pfb
NimbusMonoPS-BoldItalic.pfb NimbusSans-Bold.afm
NimbusMonoPS-Bold.pfb NimbusSans-BoldItalic.afm
NimbusMonoPS-Italic.afm NimbusSans-BoldItalic.pfb
NimbusMonoPS-Italic.pfb NimbusSans-Bold.pfb
NimbusMonoPS-Regular.afm NimbusSans-Italic.afm
NimbusMonoPS-Regular.pfb NimbusSans-Italic.pfb
NimbusRoman-Bold.afm NimbusSans-Regular.afm
NimbusRoman-BoldItalic.afm NimbusSans-Regular.pfb
NimbusRoman-BoldItalic.pfb StandardSymbolsPS.afm
NimbusRoman-Bold.pfb StandardSymbolsPS.pfb
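From searching the web, the way forward seems to be creating a fop.xml configuration file that maps the font names to the files I extracted. This is the file I prepared: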
<fop version="1.0">
<font-base>/home/luca/Desktop/ubuntufonts/</font-base>
<renderers>
<renderer mime="application/pdf">
<fonts>
<font embed-url="NimbusSans-Regular.pfb" embedding-mode="full">
<font-triplet name="Helvetica" style="normal" weight="normal" />
<font-triplet name="Calibri" style="normal" weight="normal" />
</font>
<font embed-url="NimbusSans-Bold.pfb" embedding-mode="full">
<font-triplet name="Helvetica" style="normal" weight="bold" />
<font-triplet name="Calibri" style="normal" weight="bold" />
</font>
<font embed-url="NimbusSans-Italic.pfb" embedding-mode="full">
<font-triplet name="Helvetica" style="italic" weight="normal" />
<font-triplet name="Calibri" style="italic" weight="normal" />
</font>
<font embed-url="NimbusSans-BoldItalic.pfb" embedding-mode="full">
<font-triplet name="Helvetica" style="italic" weight="bold" />
<font-triplet name="Calibri" style="italic" weight="bold" />
</font>
<font embed-url="NimbusRoman-Regular.pfb" embedding-mode="full">
<font-triplet name="Times" style="normal" weight="normal" />
<font-triplet name="Times New Roman" style="normal" weight="normal" />
</font>
<font embed-url="NimbusRoman-Bold.pfb" embedding-mode="full">
<font-triplet name="Times" style="normal" weight="bold" />
<font-triplet name="Times New Roman" style="normal" weight="bold" />
</font>
<font embed-url="NimbusRoman-Italic.pfb" embedding-mode="full">
<font-triplet name="Times" style="italic" weight="normal" />
<font-triplet name="Times New Roman" style="italic" weight="normal" />
</font>
<font embed-url="NimbusRoman-BoldItalic.pfb" embedding-mode="full">
<font-triplet name="Times" style="italic" weight="bold" />
<font-triplet name="Times New Roman" style="italic" weight="bold" />
</font>
<font embed-url="NimbusMonoPS-Regular.pfb" embedding-mode="full">
<font-triplet name="Courier" style="normal" weight="normal" />
<font-triplet name="Courier New" style="normal" weight="normal" />
</font>
<font embed-url="NimbusMonoPS-Bold.pfb" embedding-mode="full">
<font-triplet name="Courier" style="normal" weight="bold" />
<font-triplet name="Courier New" style="normal" weight="bold" />
</font>
<font embed-url="NimbusMonoPS-Italic.pfb" embedding-mode="full">
<font-triplet name="Courier" style="italic" weight="normal" />
<font-triplet name="Courier New" style="italic" weight="normal" />
</font>
<font embed-url="NimbusMonoPS-BoldItalic.pfb" embedding-mode="full">
<font-triplet name="Courier" style="italic" weight="bold" />
<font-triplet name="Courier New" style="italic" weight="bold" />
</font>
<font embed-url="StandardSymbolsPS.pfb" embedding-mode="full">
<font-triplet name="Symbol" style="normal" weight="normal" />
<font-triplet name="Symbol" style="normal" weight="bold" />
</font>
<font embed-url="D050000L.pfb" embedding-mode="full">
<font-triplet name="ZapfDingbats" style="normal" weight="normal" />
<font-triplet name="ZapfDingbats" style="normal" weight="bold" />
</font>
</fonts>
</renderer>
</renderers>
</fop>
This is the final conversion code I use:
// Document loading (required)
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(classPathResource.getFile());
// Set up font mapper (optional)
Mapper fontMapper = new IdentityPlusMapper();
wordMLPackage.setFontMapper(fontMapper);
// FO exporter setup (required)
// .. the FOSettings object
String fopConfig = Files.readString(new ClassPathResource("fop.xml").getFile().toPath());
FOSettings foSettings = Docx4J.createFOSettings();
foSettings.setApacheFopConfiguration(fopConfig);
foSettings.setOpcPackage(wordMLPackage);
FOUserAgent foUserAgent = FORendererApacheFOP.getFOUserAgent(foSettings);
foUserAgent.getRendererOptions().put("pdf-a-mode", "PDF/A-1b");
// PDF/A-1a, PDF/A-2a and PDF/A-3a require accessibility to be enabled
// see further https://stackoverflow.com/a/54587413/1031689
foUserAgent.setAccessibility(true); // suppress "missing language information" messages from FOUserAgent .processEvent
ByteArrayOutputStream os = new ByteArrayOutputStream();
Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
// Clean up, so any ObfuscatedFontPart temp files can be deleted
if (wordMLPackage.getMainDocumentPart().getFontTablePart()!=null) {
wordMLPackage.getMainDocumentPart().getFontTablePart().deleteEmbeddedFontTempFiles();
}
// This would also do it, via finalize() methods
foSettings = null;
wordMLPackage = null;
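I also tried embedding the document fonts directly in the Word document and deleting the FOP cache between attempts, but neither solved the problem.
Any idea how to fix this?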
Answer 0 (score: 1)
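Found it after wasting more than two days on this. For some reason, the value of the font-base element must be prefixed with a scheme: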
<font-base>file:/home/luca/Desktop/ubuntufonts/</font-base>
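If the font directory is only known at runtime, the scheme-prefixed value can be derived from a plain filesystem path instead of being typed by hand. The following is a minimal sketch, not part of the original setup: the class name FontBaseUri and the method toFontBase are hypothetical, and it assumes FOP accepts any standard file: URI here (Path.toUri() may emit the file:/// form rather than file:/).

```java
import java.net.URI;
import java.nio.file.Paths;

final class FontBaseUri {

    // Hypothetical helper: turns a directory path into a font-base value
    // that carries an explicit scheme and ends with a slash.
    static String toFontBase(String directory) {
        URI uri = Paths.get(directory).toAbsolutePath().normalize().toUri();
        String value = uri.toString();
        // toUri() only appends the trailing slash when the directory exists,
        // so add it here if it is missing.
        return value.endsWith("/") ? value : value + "/";
    }

    public static void main(String[] args) {
        String fontBase = toFontBase("/home/luca/Desktop/ubuntufonts");
        System.out.println("<font-base>" + fontBase + "</font-base>");
    }
}
```

The resulting string could then be substituted into the configuration text before it is passed to foSettings.setApacheFopConfiguration.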
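For future frustrated readers, I would also point out that there is really no reason to map the Base14 fonts with Type1 fonts. Do yourself a favor and map them with OTF fonts instead (on my Ubuntu VM they are in /usr/share/fonts/opentype/urw-base35), so no additional AFM/PFM file lookup is needed.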
This is my final xml configuration file:
<fop version="1.0">
<font-base>file:/home/luca/Desktop/ubuntuttf/</font-base>
<use-cache>false</use-cache>
<strict-configuration>true</strict-configuration>
<renderers>
<renderer mime="application/pdf">
<fonts>
<font embed-url="NimbusSans-Regular.otf">
<font-triplet name="Helvetica" style="normal" weight="normal" />
<font-triplet name="Calibri" style="normal" weight="normal" />
<font-triplet name="sans-serif" style="normal" weight="normal"/>
<font-triplet name="SansSerif" style="normal" weight="normal"/>
</font>
<font embed-url="NimbusSans-Bold.otf">
<font-triplet name="Helvetica" style="normal" weight="bold" />
<font-triplet name="Calibri" style="normal" weight="bold" />
<font-triplet name="sans-serif" style="normal" weight="bold"/>
<font-triplet name="SansSerif" style="normal" weight="bold"/>
</font>
<font embed-url="NimbusSans-Italic.otf">
<font-triplet name="Helvetica" style="italic" weight="normal" />
<font-triplet name="Calibri" style="italic" weight="normal" />
<font-triplet name="sans-serif" style="italic" weight="normal"/>
<font-triplet name="SansSerif" style="italic" weight="normal"/>
</font>
<font embed-url="NimbusSans-BoldItalic.otf">
<font-triplet name="Helvetica" style="italic" weight="bold" />
<font-triplet name="Calibri" style="italic" weight="bold" />
<font-triplet name="sans-serif" style="italic" weight="bold"/>
<font-triplet name="SansSerif" style="italic" weight="bold"/>
</font>
<font embed-url="NimbusRoman-Regular.otf">
<font-triplet name="Times" style="normal" weight="normal" />
<font-triplet name="Times New Roman" style="normal" weight="normal" />
<font-triplet name="serif" style="normal" weight="normal"/>
<font-triplet name="any" style="normal" weight="normal"/>
</font>
<font embed-url="NimbusRoman-Bold.otf">
<font-triplet name="Times" style="normal" weight="bold" />
<font-triplet name="Times New Roman" style="normal" weight="bold" />
<font-triplet name="serif" style="normal" weight="bold"/>
<font-triplet name="any" style="normal" weight="bold"/>
</font>
<font embed-url="NimbusRoman-Italic.otf">
<font-triplet name="Times" style="italic" weight="normal" />
<font-triplet name="Times New Roman" style="italic" weight="normal" />
<font-triplet name="serif" style="italic" weight="normal"/>
<font-triplet name="any" style="italic" weight="normal"/>
</font>
<font embed-url="NimbusRoman-BoldItalic.otf">
<font-triplet name="Times" style="italic" weight="bold" />
<font-triplet name="Times New Roman" style="italic" weight="bold" />
<font-triplet name="serif" style="italic" weight="bold"/>
<font-triplet name="any" style="italic" weight="bold"/>
</font>
<font embed-url="NimbusMonoPS-Regular.otf">
<font-triplet name="Courier" style="normal" weight="normal" />
<font-triplet name="Courier New" style="normal" weight="normal" />
<font-triplet name="monospace" style="normal" weight="normal"/>
</font>
<font embed-url="NimbusMonoPS-Bold.otf">
<font-triplet name="Courier" style="normal" weight="bold" />
<font-triplet name="Courier New" style="normal" weight="bold" />
<font-triplet name="monospace" style="normal" weight="bold"/>
</font>
<font embed-url="NimbusMonoPS-Italic.otf">
<font-triplet name="Courier" style="italic" weight="normal" />
<font-triplet name="Courier New" style="italic" weight="normal" />
<font-triplet name="monospace" style="italic" weight="normal"/>
</font>
<font embed-url="NimbusMonoPS-BoldItalic.otf">
<font-triplet name="Courier" style="italic" weight="bold" />
<font-triplet name="Courier New" style="italic" weight="bold" />
<font-triplet name="monospace" style="italic" weight="bold"/>
</font>
<font embed-url="StandardSymbolsPS.otf">
<font-triplet name="Symbol" style="normal" weight="normal" />
<font-triplet name="Symbol" style="normal" weight="bold" />
</font>
<font embed-url="D050000L.otf">
<font-triplet name="ZapfDingbats" style="normal" weight="normal" />
<font-triplet name="ZapfDingbats" style="normal" weight="bold" />
</font>
</fonts>
</renderer>
</renderers>
</fop>
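Since every family in the configuration above follows the same Regular/Bold/Italic/BoldItalic pattern, the font entries can be generated rather than typed, which makes it harder to pair a file with the wrong style or weight. This is only a sketch under stated assumptions: the class FopFontEntries and its methods are hypothetical, and it assumes the URW naming scheme Prefix-Variant.otf seen in the file list. StandardSymbolsPS and D050000L do not follow that scheme and would still be written by hand.

```java
import java.util.List;

final class FopFontEntries {

    // Builds one <font> element. Style and weight are derived from the
    // variant name, so the file and its triplets cannot drift apart.
    static String fontEntry(String filePrefix, String variant, List<String> aliases) {
        String style = variant.contains("Italic") ? "italic" : "normal";
        String weight = variant.contains("Bold") ? "bold" : "normal";
        StringBuilder sb = new StringBuilder();
        sb.append("<font embed-url=\"").append(filePrefix).append('-')
          .append(variant).append(".otf\">\n");
        for (String alias : aliases) {
            sb.append("  <font-triplet name=\"").append(alias)
              .append("\" style=\"").append(style)
              .append("\" weight=\"").append(weight).append("\"/>\n");
        }
        sb.append("</font>\n");
        return sb.toString();
    }

    // Emits the four variants of one family in a fixed order.
    static String family(String filePrefix, List<String> aliases) {
        StringBuilder sb = new StringBuilder();
        for (String variant : List.of("Regular", "Bold", "Italic", "BoldItalic")) {
            sb.append(fontEntry(filePrefix, variant, aliases));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(family("NimbusSans",
                List.of("Helvetica", "Calibri", "sans-serif", "SansSerif")));
        System.out.print(family("NimbusRoman",
                List.of("Times", "Times New Roman", "serif", "any")));
        System.out.print(family("NimbusMonoPS",
                List.of("Courier", "Courier New", "monospace")));
    }
}
```

The printed fragment is meant to be pasted between the fonts tags of the renderer section; the surrounding fop, renderers and renderer elements stay as they are.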
Also, if anyone is interested in embedding the fonts in a jar/war archive, just change the font-base element to <font-base>classpath:/fonts/</font-base>
and add your font files under /src/main/resources/fonts/.
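When the fonts are packaged this way, a missing file only shows up at conversion time, so it can help to check at startup that each file named in the configuration is actually on the classpath. A minimal sketch, assuming the fonts/ resource folder described above; the class ClasspathFontCheck and the method missingFonts are hypothetical names, and the file list in main is just a sample.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class ClasspathFontCheck {

    // Returns the names that cannot be found under the given classpath folder.
    // ClassLoader.getResource expects a path without a leading slash,
    // for example "fonts/NimbusSans-Regular.otf".
    static List<String> missingFonts(ClassLoader loader, String folder, List<String> fileNames) {
        List<String> missing = new ArrayList<>();
        for (String name : fileNames) {
            URL url = loader.getResource(folder + name);
            if (url == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> expected = List.of(
                "NimbusSans-Regular.otf",
                "NimbusRoman-Regular.otf",
                "NimbusMonoPS-Regular.otf");
        List<String> missing = missingFonts(
                ClasspathFontCheck.class.getClassLoader(), "fonts/", expected);
        System.out.println(missing.isEmpty()
                ? "All listed fonts are on the classpath"
                : "Missing from classpath: " + missing);
    }
}
```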