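I am using iText (Java) to extract XML from a PDF. It works, but it skips read-only fields: they do not appear in the generated XML. I am using the following code to extract the XML: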
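// Assumes iText 5 (com.itextpdf.text packages), which matches the class names used below.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.XfaForm;

public class PDFReadExample
{
    public static void main(String[] args) throws IOException, DocumentException, TransformerException
    {
        // Two arguments are required: the source PDF and the destination XML file.
        if (args.length < 2) {
            System.err.println("Usage: PDFReadExample <src.pdf> <dest.xml>");
            return;
        }
        String src = args[0];
        String dest = args[1];
        // Create the output directory; getAbsoluteFile() avoids a null parent for bare file names.
        File parent = new File(dest).getAbsoluteFile().getParentFile();
        if (parent != null) {
            parent.mkdirs();
        }
        new PDFReadExample().readXml(src, dest);
    }

    public void readXml(String src, String dest) throws IOException, DocumentException, TransformerException
    {
        PdfReader reader = new PdfReader(src);
        try {
            AcroFields form = reader.getAcroFields();
            XfaForm xfa = form.getXfa();
            // Start at the XFA datasets packet and narrow to its <data> child, if there is one.
            Node node = xfa.getDatasetsNode();
            NodeList list = node.getChildNodes();
            for (int i = 0; i < list.getLength(); ++i) {
                if ("data".equals(list.item(i).getLocalName())) {
                    node = list.item(i);
                    break;
                }
            }
            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            tf.setOutputProperty(OutputKeys.INDENT, "yes");
            // try-with-resources closes the output stream even if the transform fails.
            try (OutputStream os = new FileOutputStream(dest)) {
                tf.transform(new DOMSource(node), new StreamResult(os));
            }
        } finally {
            reader.close();
        }
    }
}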
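I am not very familiar with PDF. The read-only fields are populated automatically based on input from other fields. How can I get the read-only values into the extracted XML?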
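
To narrow the problem down, it may help to first confirm exactly which fields are present in the extracted XML. The following is only a diagnostic sketch using the JDK's DOM parser, not a fix: it lists every leaf element and its value. The class name `DatasetsInspector`, the `SAMPLE` string, and the field names `quantity`, `price`, and `total` are all hypothetical; in practice you would pass in the contents of the file written by `readXml`. If a calculated field is absent from this listing, one possible explanation (an assumption, not confirmed here) is that its value was never stored in the datasets packet of the PDF.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DatasetsInspector {

    // Hypothetical stand-in for the XML written by readXml(); the field names are invented.
    static final String SAMPLE =
        "<xfa:data xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">"
        + "<form1><quantity>3</quantity><price>14</price><total>42</total></form1>"
        + "</xfa:data>";

    // Parses the XML and returns a map from element path to text value for every leaf element.
    // Repeated elements with the same path overwrite each other; this is enough for a quick check.
    static Map<String, String> leafValues(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so getLocalName() returns a non-null name
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            Map<String, String> out = new LinkedHashMap<>();
            collect(doc.getDocumentElement(), "", out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Depth-first walk: an element with no element children is recorded as a leaf.
    private static void collect(Node node, String path, Map<String, String> out) {
        String here = path.isEmpty() ? node.getLocalName() : path + "/" + node.getLocalName();
        boolean hasElementChild = false;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                collect(child, here, out);
            }
        }
        if (!hasElementChild) {
            out.put(here, node.getTextContent().trim());
        }
    }

    public static void main(String[] args) {
        // Print one "path = value" line per leaf element in the sample.
        leafValues(SAMPLE).forEach((path, value) -> System.out.println(path + " = " + value));
    }
}
```

Comparing this listing with the field names visible in the form shows precisely which read-only fields are missing, which makes the question easier to answer.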