Using Java

Time: 2017-07-09 12:22:36

Tags: java xml

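I need to figure out how to validate my XML files when the schemas are offline. After looking around for four days, all I could find is basically that I need internal references to the schemas: I need to find them, download them, and change the references to local system paths. What I could not find is exactly how to do that. Where and how do I change the references from external to internal? What is the best way to download the schemas?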

3 answers:

Answer 0: (score: 0)

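There are three ways to do this. What they have in common is that you need a local copy of the schema documents. I am assuming that the instance documents currently use xsi:schemaLocation and/or xsi:noNamespaceSchemaLocation to point to a location on the web that holds the schema documents.

(a) Modify the instance documents to refer to the local copy of the schema documents. This is usually inconvenient.

(b) Redirect the references so that requests for a remote file are redirected to a local file. How to set this up depends on which schema validator you are using and how you invoke it.

(c) Tell the schema processor to ignore the values of xsi:schemaLocation and xsi:noNamespaceSchemaLocation and to validate instead against a schema that you supply through the schema processor's invocation API. Again, the details depend on which schema processor you are using.

My preferred approach is (c), if only because when you validate a source document you do not, by definition, fully trust it. So why trust it to contain a correct xsi:schemaLocation attribute?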
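
As a rough illustration of (c) with the JDK's built-in javax.xml.validation API, the sketch below compiles a schema that the caller supplies and validates against it. The class name, the tiny inline schema, and the example.invalid hint are made up for the example; in practice you would load your local .xsd file instead. The assumption here is that a Schema built from explicit sources is what the Validator checks against, so the location hint in the instance is not needed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SuppliedSchemaDemo {
    // Hypothetical stand-in for your local copy of the schema document.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='note' type='xs:string'/>"
        + "</xs:schema>";

    // Returns true if the XML validates against the supplied schema.
    static boolean isValid(String xml) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // The schema comes from the caller, not from the instance document.
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Schema compilation or validation failed.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The instance carries a remote hint, but validation uses the supplied schema.
        String hinted = "<note xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
            + " xsi:noNamespaceSchemaLocation='http://example.invalid/remote.xsd'>hi</note>";
        System.out.println(isValid(hinted));
        System.out.println(isValid("<other/>"));
    }
}
```

With a real schema set you would pass new StreamSource(new File(...)) to newSchema so that relative includes can be resolved against the file's location.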

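Answer 1: (score: 0)

XmlValidate is a simple but powerful command-line tool that can perform offline validation of single or multiple XML files against target schemas. It can scan local XML files by file name, directory, or URL.

XmlValidate automatically adds the schemaLocation based on the schema namespace and a config file that maps namespaces to local files. The tool validates against any XML Schema referenced in the config file.

Here are example mappings of namespaces to target schemas in the config file: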

http://www.opengis.net/kml/2.2=${XV_HOME}/schemas/kml22.xsd
http://appengine.google.com/ns/1.0=C:/xml/appengine-web.xsd
urn:oasis:names:tc:ciq:xsdschema:xAL:2.0=C:/xml/xAL.xsd

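Note that the ${XV_HOME} token above is simply an alias for the top-level directory that XmlValidate is run from. The location can just as well be a full file path.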
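
To make the mapping format concrete, here is a minimal sketch of how such lines could be read into a namespace-to-path map with the ${XV_HOME} token expanded. This is not XmlValidate's actual code: the class name, method name, and the /opt/xv directory are hypothetical, and it assumes one mapping per line, split at the first '='.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NamespaceMapSketch {
    // Parses "namespace=path" lines and expands the ${XV_HOME} token.
    static Map<String, String> parse(List<String> lines, String xvHome) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            int eq = trimmed.indexOf('=');
            if (trimmed.isEmpty() || eq < 0) {
                continue; // skip blank or malformed lines
            }
            String namespace = trimmed.substring(0, eq);
            String path = trimmed.substring(eq + 1).replace("${XV_HOME}", xvHome);
            map.put(namespace, path);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "http://www.opengis.net/kml/2.2=${XV_HOME}/schemas/kml22.xsd",
            "urn:oasis:names:tc:ciq:xsdschema:xAL:2.0=C:/xml/xAL.xsd");
        parse(lines, "/opt/xv").forEach((ns, path) -> System.out.println(ns + " -> " + path));
    }
}
```

Note that java.util.Properties is deliberately not used here, because it would treat the first ':' in an unescaped namespace URI as the key/value separator.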

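XmlValidate is an open-source project (source code available) that runs on the Java Runtime Environment (JRE). The bundled application (Java jar, examples, etc.) is available for download.

If XmlValidate is run in batch mode against multiple XML files, it provides a summary of the validation results.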

Errors: 17  Warnings: 0  Files: 11  Time: 1506 ms
Valid files 8/11 (73%)

Answer 2: (score: 0)

You can set your own implementations of ResourceResolver and LSInput on the SchemaFactory, so that the call to LSInput.getCharacterStream() provides a schema from a local path.
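
For readers who want to see the mechanism in isolation, here is a minimal, self-contained sketch of that idea. It is not the class described in this answer: the names InMemoryResolverDemo, StringInput, and LOCAL are hypothetical, and the "local copies" are kept in an in-memory map instead of on the classpath so the example runs without any files. It assumes Java 9+ (for Map.of) and the JDK's built-in schema factory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.ls.LSInput;
import org.xml.sax.SAXException;

class InMemoryResolverDemo {
    /** Minimal LSInput that carries only string data plus identifiers. */
    static final class StringInput implements LSInput {
        private String publicId, systemId, baseUri, data;
        StringInput(String publicId, String systemId, String baseUri, String data) {
            this.publicId = publicId; this.systemId = systemId;
            this.baseUri = baseUri; this.data = data;
        }
        public Reader getCharacterStream() { return null; }
        public void setCharacterStream(Reader r) { }
        public InputStream getByteStream() { return null; }
        public void setByteStream(InputStream s) { }
        public String getStringData() { return data; }
        public void setStringData(String s) { data = s; }
        public String getSystemId() { return systemId; }
        public void setSystemId(String s) { systemId = s; }
        public String getPublicId() { return publicId; }
        public void setPublicId(String s) { publicId = s; }
        public String getBaseURI() { return baseUri; }
        public void setBaseURI(String s) { baseUri = s; }
        public String getEncoding() { return null; }
        public void setEncoding(String e) { }
        public boolean getCertifiedText() { return false; }
        public void setCertifiedText(boolean b) { }
    }

    // Main schema; it pulls in types.xsd through xs:include.
    static final String MAIN_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:include schemaLocation='types.xsd'/>"
        + "<xs:element name='code' type='CodeType'/></xs:schema>";

    // Hypothetical local copies, keyed by the systemId the parser asks for.
    static final Map<String, String> LOCAL = Map.of("types.xsd",
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:simpleType name='CodeType'><xs:restriction base='xs:string'>"
        + "<xs:length value='3'/></xs:restriction></xs:simpleType></xs:schema>");

    static Schema loadSchema() {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Serve known systemIds from LOCAL; returning null keeps the default lookup.
        factory.setResourceResolver((type, ns, publicId, systemId, baseUri) -> {
            String data = LOCAL.get(systemId);
            return data == null ? null : new StringInput(publicId, systemId, baseUri, data);
        });
        try {
            return factory.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema could not be compiled", e);
        }
    }

    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document does not conform to the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<code>abc</code>"));
        System.out.println(isValid("<code>toolong</code>"));
    }
}
```

Writing a small LSInput by hand avoids depending on a parser-specific implementation class; the cost is the boilerplate of the sixteen interface methods.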

I wrote an extra class to do offline validation. You can call it like this:

new XmlSchemaValidator().validate(xmlStream, schemaStream, "https://schema.datacite.org/meta/kernel-4.1/",
                        "schemas/datacite/kernel-4.1/");

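Two InputStreams are passed in: one for the XML and one for the schema. A baseUrl and a localPath (relative to the classpath) are passed as the third and fourth parameters. The validator uses these last two parameters to look for additional schemas, either locally under localPath or relative to the provided baseUrl.

I tested with the set of schemas and examples from DataCite kernel-4.1 (https://schema.datacite.org/meta/kernel-4.1/).

Complete example: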

 @Test
 public void validate4() throws Exception {
        InputStream xmlStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(
                        "schemas/datacite/kernel-4.1/example/datacite-example-complicated-v4.1.xml");
        InputStream schemaStream = Thread.currentThread().getContextClassLoader()
                        .getResourceAsStream("schemas/datacite/kernel-4.1/metadata.xsd");
        new XmlSchemaValidator().validate(xmlStream, schemaStream, "https://schema.datacite.org/meta/kernel-4.1/",
                        "schemas/datacite/kernel-4.1/");
 }

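XmlSchemaValidator validates the XML against the schema and searches locally for included schemas. It uses a ResourceResolver to override the default behavior and search locally.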

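import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
// Assumption: DOMInputImpl comes from Xerces-J; the original does not say which package is used.
import org.apache.xerces.dom.DOMInputImpl;
import org.w3c.dom.ls.LSInput;
import org.xml.sax.SAXException;

public class XmlSchemaValidator {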
    /**
     * @param xmlStream
     *            xml data as a stream
     * @param schemaStream
     *            schema as a stream
     * @param baseUri
     *            to search for relative paths on the web
     * @param localPath
     *            to search for schemas on a local directory
     * @throws SAXException
     *             if validation fails
     * @throws IOException
     *             not further specified
     */
    public void validate(InputStream xmlStream, InputStream schemaStream, String baseUri, String localPath)
                    throws SAXException, IOException {
        Source xmlFile = new StreamSource(xmlStream);
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
            InputStream in = getSchemaAsStream(systemId, baseUri, localPath);
            if (in == null) {
                // Found neither locally nor on the web: fall back to the default resolution
                return null;
            }
            LSInput input = new DOMInputImpl();
            input.setPublicId(publicId);
            input.setSystemId(systemId);
            input.setBaseURI(baseUri);
            // Pass raw bytes so the parser can detect the encoding from the XML declaration
            input.setByteStream(in);
            return input;
        });
        Schema schema = factory.newSchema(new StreamSource(schemaStream));
        javax.xml.validation.Validator validator = schema.newValidator();
        validator.validate(xmlFile);
    }

    private InputStream getSchemaAsStream(String systemId, String baseUri, String localPath) {
        InputStream in = getSchemaFromClasspath(systemId, localPath);
        // You could simply return in here if you are sure that everything is on
        // your machine. getSchemaFromWeb is only called as a last resort.
        return in == null ? getSchemaFromWeb(baseUri, systemId) : in;
    }

    private InputStream getSchemaFromClasspath(String systemId, String localPath) {
        System.out.println("Try to get stuff from localdir: " + localPath + systemId);
        return Thread.currentThread().getContextClassLoader().getResourceAsStream(localPath + systemId);
    }

    /*
     * You can leave out the webstuff if you are sure that everything is
     * available on your machine
     */
    private InputStream getSchemaFromWeb(String baseUri, String systemId) {
        try {
            URI uri = new URI(systemId);
            if (uri.isAbsolute()) {
                System.out.println("Get stuff from web: " + systemId);
                return urlToInputStream(uri.toURL(), "text/xml");
            }
            System.out.println("Get stuff from web: Host: " + baseUri + " Path: " + systemId);
            return getSchemaRelativeToBaseUri(baseUri, systemId);
        } catch (Exception e) {
            // maybe the systemId is not a valid URI or
            // the web has nothing to offer under this address
        }
        return null;
    }

    private InputStream urlToInputStream(URL url, String accept) {
        HttpURLConnection con = null;
        InputStream inputStream = null;
        try {
            con = (HttpURLConnection) url.openConnection();
            con.setConnectTimeout(15000);
            con.setRequestProperty("User-Agent", "Name of my application.");
            con.setReadTimeout(15000);
            con.setRequestProperty("Accept", accept);
            con.connect();
            int responseCode = con.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_MOVED_PERM
                            || responseCode == HttpURLConnection.HTTP_MOVED_TEMP || responseCode == 307
                            || responseCode == 303) {
                String redirectUrl = con.getHeaderField("Location");
                try {
                    URL newUrl = new URL(redirectUrl);
                    return urlToInputStream(newUrl, accept);
                } catch (MalformedURLException e) {
                    URL newUrl = new URL(url.getProtocol() + "://" + url.getHost() + redirectUrl);
                    return urlToInputStream(newUrl, accept);
                }
            }
            inputStream = con.getInputStream();
            return inputStream;
        } catch (SocketTimeoutException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

    }

    private InputStream getSchemaRelativeToBaseUri(String baseUri, String systemId) {
        try {
            URL url = new URL(baseUri + systemId);
            return urlToInputStream(url, "text/xml");
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }
}

This prints:

Try to get stuff from localdir: schemas/datacite/kernel-4.1/http://www.w3.org/2009/01/xml.xsd
Get stuff from web: http://www.w3.org/2009/01/xml.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-titleType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-contributorType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-dateType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-resourceType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-relationType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-relatedIdentifierType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-funderIdentifierType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-descriptionType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-nameType-v4.1.xsd
  

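The output shows that the validator was able to validate against a set of local schemas. Only http://www.w3.org/2009/01/xml.xsd was not available locally and was therefore fetched from the internet.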
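
If you want to avoid even that one download, one option is to store a copy of xml.xsd next to the other schemas and map absolute systemIds to their file name before the classpath lookup. The helper below is only a sketch of that mapping: the class and method names are hypothetical, and it assumes the copy is saved directly under localPath using the last path segment of the URL.

```java
import java.net.URI;
import java.net.URISyntaxException;

class OfflinePathSketch {
    // Maps a systemId to a classpath resource name under localPath.
    // Absolute URLs are reduced to their last path segment; relative ids are kept as-is.
    static String toLocalResourcePath(String localPath, String systemId) {
        try {
            URI uri = new URI(systemId);
            String path = uri.getPath();
            if (uri.isAbsolute() && path != null) {
                return localPath + path.substring(path.lastIndexOf('/') + 1);
            }
        } catch (URISyntaxException e) {
            // Not a valid URI: treat it as a plain relative name below.
        }
        return localPath + systemId;
    }

    public static void main(String[] args) {
        String localPath = "schemas/datacite/kernel-4.1/";
        System.out.println(toLocalResourcePath(localPath, "http://www.w3.org/2009/01/xml.xsd"));
        System.out.println(toLocalResourcePath(localPath, "include/datacite-titleType-v4.xsd"));
    }
}
```

In the class above, getSchemaFromClasspath could use such a mapping instead of concatenating localPath and systemId directly, and getSchemaAsStream could then return the local stream without falling back to the web.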