JAXP with Saxon-HE: a StreamSource created from an XML File does not release the file after a parse error

Time: 2016-08-30 14:53:42

Tags: java saxon jaxp

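I am using the JAXP API together with the Saxon-HE API. The main goal is to build an application that transforms XML files with configurable XSLT stylesheets and can overwrite the generated output documents. I will skip the details, because I created a sample project that illustrates the problem:

Use case: if a transformation error occurs, moving the XML file to another directory (for example an error directory) throws an access exception.

When I instantiate a StreamSource from a File instance (pointing to the XML file) and a parse error occurs, moving the file afterwards throws a "The process cannot access the file because it is being used by another process." exception.

Here is a single-class application with a main method that I wrote to illustrate the problem: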

package com.sample.xslt.application;

import net.sf.saxon.Configuration;
import net.sf.saxon.lib.FeatureKeys;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;

public class XsltApplicationSample {

  public static void main(String[] args) throws Exception {

    if (args.length != 2) {
      throw new RuntimeException("Two arguments are expected : <xslFilePath> <inputFilePath>");
    }
    String xslFilePath = args[0];
    String xmlFilePath = args[1];

    TransformerFactory factory = TransformerFactory.newInstance();
    factory.setAttribute(FeatureKeys.ALLOW_MULTITHREADING, Boolean.TRUE);
    factory.setAttribute(FeatureKeys.RECOVERY_POLICY,
        new Integer(Configuration.RECOVER_WITH_WARNINGS));

    Source xslSource = new StreamSource(new File(xslFilePath));
    Source xmlSource = new StreamSource(new File(xmlFilePath));
    Transformer transformer = factory.newTransformer(xslSource);

    try {
      transformer.transform(xmlSource, new DOMResult());

    } catch (TransformerException e) {
      System.out.println(e.getMessage());
    }

    // move input file to tmp directory (for example, could be configured error dir)

    File srcFile = Paths.get(xmlFilePath).toFile();
    File tempDir = new File(System.getProperty("java.io.tmpdir"));

    Path destFilePath = new File(tempDir, srcFile.getName()).toPath();

    try {
      Files.move(srcFile.toPath(), destFilePath, StandardCopyOption.REPLACE_EXISTING);
    } catch (SecurityException | IOException e) {
      System.out.println(e.getMessage());
    }
  }
}

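The configured XSLT stylesheet must be valid for the problem to reproduce. If the input XML file is empty, a transformation/parse error is still raised, but the file access error does not occur.

Sample input file that reproduces the problem: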

<root>
    <elem>
</root>

Sample STDOUT:

JAXP: find factoryId =javax.xml.transform.TransformerFactory
JAXP: find factoryId =javax.xml.parsers.SAXParserFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl using ClassLoader: null
JAXP: find factoryId =javax.xml.parsers.SAXParserFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl using ClassLoader: null
JAXP: find factoryId =javax.xml.parsers.DocumentBuilderFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl using ClassLoader: null
JAXP: find factoryId =javax.xml.parsers.SAXParserFactory
JAXP: loaded from fallback value: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
JAXP: created new instance of class com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl using ClassLoader: null
Error on line 3 column 3 of input_err.xml:
  SXXP0003: Error reported by XML parser: The element type "elem" must be terminated by the
  matching end-tag "</elem>".
org.xml.sax.SAXParseException; systemId: file:/C:/<path>/input_err.xml; lineNumber: 3; columnNumber: 3; The element type "elem" must be terminated by the matching end-tag "</elem>".
C:\<path>\input_err.xml -> C:\<path>\AppData\Local\Temp\input_err.xml: The process cannot access the file because it is being used by another process.

Command line used (I run it from Eclipse):

java ... -Djaxp.debug=1 -Dfile.encoding=UTF-8 -classpath <...> com.sample.xslt.application.XsltApplicationSample C:\<path>\transform.xsl C:\<path>\input_err.xml

The pom.xml used:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.sample</groupId>
    <artifactId>XsltExampleProject</artifactId>
    <version>1.0.0-SNAPSHOT</version>

    <name>XsltExampleProject</name>
    <description>XSLT example project</description>

    <dependencies>
        <dependency>
            <groupId>net.sf.saxon</groupId>
            <artifactId>Saxon-HE</artifactId>
            <version>9.7.0-7</version>
        </dependency>

        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.5</version>
        </dependency>

        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
            <version>3.2.1</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

The workaround I use is to load the content of the XML input file into memory as a String, as shown here:

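// Uses org.apache.commons.io.FileUtils (commons-io), java.io.StringReader
// and java.nio.charset.StandardCharsets.
String xmlContent = FileUtils.readFileToString(new File(xmlFilePath), StandardCharsets.UTF_8);

Source xslSource = new StreamSource(new File(xslFilePath));
Source xmlSource = new StreamSource(new StringReader(xmlContent));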

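Am I missing something when initializing the Transformer? Should the SAX parser that is resolved by default be replaced with another one recommended by Saxon? Judging by the debug log, the Xerces parser bundled with the JDK is being used, but is it fully compatible with the transformer implementation provided by Saxon? I am a bit confused about this.

Thanks for your help!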

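1 Answer:

Answer 0: (score: 1)

Judging from the comment thread following the question, this appears to be a bug/deficiency in the XML parser shipped with the JDK. Your options are: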

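(a) Report the bug and wait patiently for it to be fixed.

(b) Use the Apache Xerces parser instead.

(c) Instead of supplying a File, supply a FileInputStream and close it yourself.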
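
Option (c) could look roughly like the sketch below. It is only an illustration, not code from the question: the class name CloseStreamYourselfSketch and the helper transformClosingInput are made up for this example, and the JDK's built-in identity transformer stands in for the Saxon transformer so that the sketch runs without extra dependencies. The point is that the stream is opened and closed by the caller (try-with-resources), so the file is no longer held open when it is moved, whether or not parsing failed.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;

public class CloseStreamYourselfSketch {

  /**
   * Hypothetical helper: runs the transformation on a stream that the caller owns.
   * Returns true on success and false if the transformation failed.
   * The stream is closed in both cases by try-with-resources.
   */
  public static boolean transformClosingInput(Transformer transformer, File xmlFile)
      throws IOException {
    try (InputStream in = new FileInputStream(xmlFile)) {
      // Pass a system ID as well, so error messages and relative URIs still refer to the file.
      StreamSource xmlSource = new StreamSource(in, xmlFile.toURI().toString());
      transformer.transform(xmlSource, new DOMResult());
      return true;
    } catch (TransformerException e) {
      System.out.println(e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    // Malformed input similar to the sample in the question (unclosed <elem>).
    Path input = Files.createTempFile("input_err", ".xml");
    Files.write(input, "<root>\n    <elem>\n</root>\n".getBytes(StandardCharsets.UTF_8));

    // Assumption: identity transformer from the default factory; with Saxon-HE this
    // would be the transformer built from the configured stylesheet.
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    boolean ok = transformClosingInput(transformer, input.toFile());

    // The stream is closed by now, so the move should not be blocked by this process.
    Path errorDir = Files.createTempDirectory("error-dir");
    Path target = errorDir.resolve(input.getFileName());
    Files.move(input, target, StandardCopyOption.REPLACE_EXISTING);
    System.out.println("transformed=" + ok + ", moved=" + Files.exists(target));
  }
}
```

Compared with the String-based workaround in the question, this avoids loading the whole document into memory, at the cost of the caller managing the stream's lifetime.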

My recommendation is (b), because the Apache Xerces parser is more reliable than the version included in the JDK.
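
One possible way to apply option (b) is sketched below, under several assumptions: that Apache Xerces-J (xercesImpl.jar) has been added to the classpath, that its factory class is org.apache.xerces.jaxp.SAXParserFactoryImpl, and that Saxon obtains its parser through the JAXP SAXParserFactory lookup, as the jaxp.debug output in the question suggests. The class name XercesParserSelectionSketch and the method selectSaxParserFactory are invented for this example. The sketch sets the standard JAXP system property only when the Xerces class can be loaded, and reports which factory JAXP actually resolves.

```java
import javax.xml.parsers.SAXParserFactory;

public class XercesParserSelectionSketch {

  // Assumption: factory class provided by Apache Xerces-J (xercesImpl.jar).
  static final String APACHE_XERCES_FACTORY = "org.apache.xerces.jaxp.SAXParserFactoryImpl";

  /**
   * Hypothetical helper: points the JAXP lookup at Apache Xerces if it is on the
   * classpath, then returns the class name of the factory that JAXP resolves.
   */
  public static String selectSaxParserFactory() {
    try {
      Class.forName(APACHE_XERCES_FACTORY);
      // Standard JAXP system property consulted by SAXParserFactory.newInstance().
      System.setProperty("javax.xml.parsers.SAXParserFactory", APACHE_XERCES_FACTORY);
    } catch (ClassNotFoundException e) {
      System.out.println("Apache Xerces is not on the classpath; keeping the JDK default parser");
    }
    return SAXParserFactory.newInstance().getClass().getName();
  }

  public static void main(String[] args) {
    // Call this before creating the TransformerFactory, so the first parse already uses it.
    System.out.println("SAXParserFactory in use: " + selectSaxParserFactory());
  }
}
```

With -Djaxp.debug=1, as in the question's command line, the log should then show which factory was loaded, which makes it easy to check whether the switch took effect.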