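I need to figure out how to validate my XML files against their schemas while offline. After looking around for four days, what I have basically found is that I need local references to the schemas: find them, download them, and change the references to local system paths. What I have not been able to find is exactly how to do that. Where and how do I change the references to be internal rather than external? What is the best way to download the schemas?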
Answer 0 (score: 0)
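There are three ways to do this. What they have in common is that you need a local copy of the schema documents. I am assuming that the instance documents currently use xsi:schemaLocation and/or xsi:noNamespaceSchemaLocation to point to a location on the web where the schema documents are held.
(a) Modify your instance documents to refer to the local copy of the schema documents. This is usually inconvenient.
(b) Redirect the references, so that a request for a remote file is redirected to a local file. How to set this up depends on which schema validator you are using and how you invoke it.
(c) Tell the schema processor to ignore the values of xsi:schemaLocation and xsi:noNamespaceSchemaLocation, and to validate instead against a schema that you supply through the schema processor's invocation API. Again, the details depend on which schema processor you are using.
My preferred approach is (c), if only because when you validate a source document you, by definition, do not fully trust it. So why should you trust it to contain a correct xsi:schemaLocation attribute?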
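As a minimal sketch of approach (c) in Java, assuming the standard javax.xml.validation API is the schema processor in use: the schema is compiled from a locally supplied source and handed to the validator directly, so the location hint inside the instance is not what decides which schema applies. The class name, the tiny schema, and the example.invalid URL are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class LocalSchemaValidation {

    // Hypothetical schema standing in for a locally stored copy.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // The instance carries a location hint pointing at a made-up remote URL.
    static final String HINT =
        " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
      + " xsi:noNamespaceSchemaLocation='http://example.invalid/note.xsd'";

    static final String GOOD_XML = "<note" + HINT + "><to>Ann</to></note>";
    static final String BAD_XML = "<note" + HINT + "><from>Bob</from></note>";

    // Returns true if xml validates against the supplied schema text.
    // A schema that fails to compile also yields false.
    static boolean isValid(String xml, String xsd) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // The schema comes from the caller, not from the instance document.
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("good: " + isValid(GOOD_XML, XSD));
        System.out.println("bad:  " + isValid(BAD_XML, XSD));
    }
}
```

In a real setup the schema source would more likely be a file on disk, for example `factory.newSchema(new File(...))`, rather than a string constant.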
Answer 1 (score: 0)
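XmlValidate is a simple but powerful command-line tool that can perform offline validation of single or multiple XML files against target schemas. It can scan for local XML files by file name, directory, or URL.
XmlValidate automatically adds the schemaLocation based on the schema namespace and a configuration file that maps namespaces to local files. The tool will validate against any XML Schema referenced in the configuration file.
Here are example mappings of namespaces to target schemas in the configuration file: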
http://www.opengis.net/kml/2.2=${XV_HOME}/schemas/kml22.xsd
http://appengine.google.com/ns/1.0=C:/xml/appengine-web.xsd
urn:oasis:names:tc:ciq:xsdschema:xAL:2.0=C:/xml/xAL.xsd
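Note that the ${XV_HOME} token above is just an alias for the top-level directory that XmlValidate is run from. The location can just as well be a full file path.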
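To illustrate how such a mapping could be read, here is a hypothetical sketch in Java. It is not XmlValidate's actual implementation: the class name, the method name, and the choice to split each line at the first '=' are assumptions. Lines are split by hand because java.util.Properties would treat the ':' inside a namespace URI as a key/value separator.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaMapping {

    // Parses "namespace=path" lines and expands the ${XV_HOME} token.
    // Assumption: the first '=' on a line separates namespace and path.
    static Map<String, String> parse(List<String> lines, String xvHome) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            int eq = trimmed.indexOf('=');
            if (trimmed.isEmpty() || eq < 0) {
                continue; // skip blank or malformed lines
            }
            String namespace = trimmed.substring(0, eq);
            String path = trimmed.substring(eq + 1).replace("${XV_HOME}", xvHome);
            mapping.put(namespace, path);
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "http://www.opengis.net/kml/2.2=${XV_HOME}/schemas/kml22.xsd",
            "urn:oasis:names:tc:ciq:xsdschema:xAL:2.0=C:/xml/xAL.xsd");
        // "/opt/xv" is a made-up installation directory.
        parse(lines, "/opt/xv").forEach((ns, path) -> System.out.println(ns + " -> " + path));
    }
}
```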
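XmlValidate is an open-source project (source code available) that runs with the Java Runtime Environment (JRE). The bundled application (Java jar, examples, etc.) is available for download.
If XmlValidate is run in batch mode against multiple XML files, it provides a summary of the validation results: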
Errors: 17 Warnings: 0 Files: 11 Time: 1506 ms
Valid files 8/11 (73%)
Answer 2 (score: 0)
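You can set your own implementations of ResourceResolver and LSInput on the SchemaFactory, so that LSInput.getCharacterStream() serves schemas referenced under https://schema.datacite.org/meta/kernel-4.1/ from a local path.
I have written an extra class to do offline validation. You can call it like this: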
new XmlSchemaValidator().validate(xmlStream, schemaStream, "https://schema.datacite.org/meta/kernel-4.1/",
"schemas/datacite/kernel-4.1/");
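Two InputStreams are passed in: one for the XML and one for the schema. A baseUrl and a localPath (relative to the classpath) are passed as the third and fourth arguments. The validator uses these last two arguments to look up additional schemas, either locally under localPath or relative to the provided baseUrl.
Full example: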
@Test
public void validate4() throws Exception {
    InputStream xmlStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(
            "schemas/datacite/kernel-4.1/example/datacite-example-complicated-v4.1.xml");
    InputStream schemaStream = Thread.currentThread().getContextClassLoader()
            .getResourceAsStream("schemas/datacite/kernel-4.1/metadata.xsd");
    new XmlSchemaValidator().validate(xmlStream, schemaStream, "https://schema.datacite.org/meta/kernel-4.1/",
            "schemas/datacite/kernel-4.1/");
}
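XmlSchemaValidator validates the XML against the schema and looks up included schemas locally. It uses a ResourceResolver to override the default behavior and search locally.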
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.ls.LSInput;
import org.xml.sax.SAXException;

// Assumption: the JDK-internal copy of Xerces is used here. With a standalone
// Xerces dependency the class is org.apache.xerces.dom.DOMInputImpl instead.
import com.sun.org.apache.xerces.internal.dom.DOMInputImpl;

public class XmlSchemaValidator {
    /**
     * @param xmlStream
     *            xml data as a stream
     * @param schemaStream
     *            schema as a stream
     * @param baseUri
     *            base URI used to resolve relative paths on the web
     * @param localPath
     *            classpath directory to search for schemas locally
     * @throws SAXException
     *             if validation fails
     * @throws IOException
     *             not further specified
     */
    public void validate(InputStream xmlStream, InputStream schemaStream, String baseUri, String localPath)
            throws SAXException, IOException {
        Source xmlFile = new StreamSource(xmlStream);
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Called for every schema that the main schema imports or includes.
        factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
            InputStream in = getSchemaAsStream(systemId, baseUri, localPath);
            if (in == null) {
                // Nothing found locally or on the web. Returning null lets the
                // parser fall back to its default resolution instead of
                // failing with a NullPointerException in InputStreamReader.
                return null;
            }
            LSInput input = new DOMInputImpl();
            input.setPublicId(publicId);
            input.setSystemId(systemId);
            input.setBaseURI(baseUri);
            input.setCharacterStream(new InputStreamReader(in));
            return input;
        });
        Schema schema = factory.newSchema(new StreamSource(schemaStream));
        javax.xml.validation.Validator validator = schema.newValidator();
        validator.validate(xmlFile);
    }

    private InputStream getSchemaAsStream(String systemId, String baseUri, String localPath) {
        InputStream in = getSchemaFromClasspath(systemId, localPath);
        // You could just return in; , if you are sure that everything is on
        // your machine. Here I call getSchemaFromWeb as last resort.
        return in == null ? getSchemaFromWeb(baseUri, systemId) : in;
    }

    private InputStream getSchemaFromClasspath(String systemId, String localPath) {
        System.out.println("Try to get stuff from localdir: " + localPath + systemId);
        return Thread.currentThread().getContextClassLoader().getResourceAsStream(localPath + systemId);
    }

    /*
     * You can leave out the webstuff if you are sure that everything is
     * available on your machine
     */
    private InputStream getSchemaFromWeb(String baseUri, String systemId) {
        try {
            URI uri = new URI(systemId);
            if (uri.isAbsolute()) {
                System.out.println("Get stuff from web: " + systemId);
                return urlToInputStream(uri.toURL(), "text/xml");
            }
            System.out.println("Get stuff from web: Host: " + baseUri + " Path: " + systemId);
            return getSchemaRelativeToBaseUri(baseUri, systemId);
        } catch (Exception e) {
            // maybe the systemId is not a valid URI or
            // the web has nothing to offer under this address
        }
        return null;
    }

    // Opens the URL and follows 301, 302, 303 and 307 redirects by hand.
    private InputStream urlToInputStream(URL url, String accept) {
        HttpURLConnection con = null;
        InputStream inputStream = null;
        try {
            con = (HttpURLConnection) url.openConnection();
            con.setConnectTimeout(15000);
            con.setRequestProperty("User-Agent", "Name of my application.");
            con.setReadTimeout(15000);
            con.setRequestProperty("Accept", accept);
            con.connect();
            int responseCode = con.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_MOVED_PERM
                    || responseCode == HttpURLConnection.HTTP_MOVED_TEMP || responseCode == 307
                    || responseCode == 303) {
                String redirectUrl = con.getHeaderField("Location");
                try {
                    URL newUrl = new URL(redirectUrl);
                    return urlToInputStream(newUrl, accept);
                } catch (MalformedURLException e) {
                    // The Location header was relative: resolve it against the original host.
                    URL newUrl = new URL(url.getProtocol() + "://" + url.getHost() + redirectUrl);
                    return urlToInputStream(newUrl, accept);
                }
            }
            inputStream = con.getInputStream();
            return inputStream;
        } catch (SocketTimeoutException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    private InputStream getSchemaRelativeToBaseUri(String baseUri, String systemId) {
        try {
            URL url = new URL(baseUri + systemId);
            return urlToInputStream(url, "text/xml");
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }
}
This prints:
Try to get stuff from localdir: schemas/datacite/kernel-4.1/http://www.w3.org/2009/01/xml.xsd
Get stuff from web: http://www.w3.org/2009/01/xml.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-titleType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-contributorType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-dateType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-resourceType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-relationType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-relatedIdentifierType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-funderIdentifierType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-descriptionType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-nameType-v4.1.xsd
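The output shows that the validator was able to validate against a set of local schemas. Only
http://www.w3.org/2009/01/xml.xsd
was not available locally and was therefore fetched from the internet.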
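The resolver idea can also be tried in isolation. The following is a minimal sketch, not the class above: it assumes the included schemas are held in an in-memory map (a stand-in for a local directory), and it creates the LSInput through the standard DOMImplementationLS factory rather than an implementation class. The class name, the map, and the two tiny schemas are made up for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.xml.sax.SAXException;

public class InMemoryResolverDemo {

    // Main schema; it pulls in "types.xsd" through xs:include.
    static final String MAIN_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:include schemaLocation='types.xsd'/>"
      + "<xs:element name='title' type='TitleType'/>"
      + "</xs:schema>";

    // Hypothetical local store: systemId -> schema text.
    static final Map<String, String> LOCAL = new HashMap<>();
    static {
        LOCAL.put("types.xsd",
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:simpleType name='TitleType'><xs:restriction base='xs:string'>"
          + "<xs:minLength value='1'/></xs:restriction></xs:simpleType>"
          + "</xs:schema>");
    }

    static boolean isValid(String xml) {
        try {
            DOMImplementationLS ls = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
                String local = LOCAL.get(systemId);
                if (local == null) {
                    return null; // unknown schema: leave it to default resolution
                }
                LSInput input = ls.createLSInput();
                input.setPublicId(publicId);
                input.setSystemId(systemId);
                input.setCharacterStream(new StringReader(local));
                return input;
            });
            Schema schema = factory.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema could not be compiled or document is invalid
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<title>Hello</title>"));
        System.out.println(isValid("<title></title>"));
    }
}
```

Returning null for an unknown systemId hands the lookup back to the parser's default resolution. For strictly offline use one could throw an exception there instead, so that a missing local copy is reported rather than fetched.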