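I want to build an XML document validator: a program that walks an XML document and checks its attributes for duplicates and for conformance to a defined schema (not whether the XML conforms to the XML standard, but whether the attributes follow specific rules).
I have experience with:
Which language/library/extension would you recommend for this kind of task?
Thanks in advance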
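Answer 0: (score: 0)
I would use libxml2, or one of its bindings in whatever language you prefer. How you validate a particular document depends on the XML dialect it uses. There are currently three common validation mechanisms: DTD, RelaxNG and XML-Schema, and every self-respecting dialect publishes a specification in at least one of them.
The following is a C version that validates a MathML document using RelaxNG: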
#include <err.h>
#include <libxml/parser.h>
#include <libxml/relaxng.h>

static const xmlChar
mml_rng_uri[] = "http://www.w3.org/Math/RelaxNG/mathml3/mathml3.rng";
/**
* @brief Validate the MathML document located at the given URI.
*/
/*
 * -- Implementation notes --
 *
 * The goal is xmlRelaxNGValidateDoc.
 * For that we need an xmlDocPtr for the document and an
 * xmlRelaxNGValidCtxtPtr for the RelaxNG schema.
 * Given a URI we can use xmlCtxtReadFile for the document.
 * We will also need a validation schema, which is always the result of a
 * RelaxNG parse operation.
 * The parse operation requires a parser context obtained from either
 * xmlRelaxNGNewParserCtxt, which takes a URI, or xmlRelaxNGNewMemParserCtxt,
 * which takes a pointer and a size.
 *
 * -- Short hand --
 * xmlRelaxNGValidateDoc()
 *  |
 *  |- xmlDocPtr = xmlCtxtReadFile()
 *  |   |
 *  |   |- xmlParserCtxtPtr = xmlNewParserCtxt()
 *  |
 *  |- xmlRelaxNGValidCtxtPtr = xmlRelaxNGNewValidCtxt()
 *  |   |
 *  |   |- xmlRelaxNGPtr = xmlRelaxNGParse()
 *  |   |   |
 *  |   |   |- xmlRelaxNGParserCtxtPtr = xmlRelaxNGNewParserCtxt()
 *  |   |   |- xmlRelaxNGParserCtxtPtr = xmlRelaxNGNewMemParserCtxt()
 */
int MML_validate(const char *uri)
{
    xmlDocPtr doc;
    xmlParserCtxtPtr docparser;
    xmlRelaxNGValidCtxtPtr validator;
    xmlRelaxNGPtr schema;
    xmlRelaxNGParserCtxtPtr rngparser;
    int retval;

    /* RelaxNG schema setup */
    /* xmlRelaxNGNewParserCtxt takes a const char *, hence the cast. */
    if( (rngparser = xmlRelaxNGNewParserCtxt((const char *)mml_rng_uri)) == NULL )
        errx(1, "Failed to create a RelaxNG parser");
    if( (schema = xmlRelaxNGParse(rngparser)) == NULL )
        errx(1, "Failed to parse MathML RelaxNG schema");
    if( (validator = xmlRelaxNGNewValidCtxt(schema)) == NULL )
        errx(1, "Failed to create a RelaxNG validator");

    /* MathML document setup */
    if( (docparser = xmlNewParserCtxt()) == NULL )
        errx(1, "Failed to create a document parser");
    if( (doc = xmlCtxtReadFile(docparser, uri, NULL, XML_PARSE_XINCLUDE)) ==
        NULL )
        errx(1, "Failed to parse document at %s", uri);

    /* Validation: 0 if the document is valid, a positive error code if it
     * is not, -1 on an internal or API error */
    retval = xmlRelaxNGValidateDoc(validator, doc);

    /* Clean up, including the document and its parser context */
    xmlFreeDoc(doc);
    xmlFreeParserCtxt(docparser);
    xmlRelaxNGFreeValidCtxt(validator);
    xmlRelaxNGFreeParserCtxt(rngparser);
    xmlRelaxNGFree(schema);

    return(retval);
}
Answer 1: (score: 0)
The requirements statement is too short to say anything definitive, but this sounds like a Schematron problem to me.