Extracting a representative subset of an XML file for parser testing

Time: 2015-12-11 13:03:10

Tags: xml rdf owl

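I have a fairly long XML/RDF/OWL file that I need to parse into a set of dictionaries I am working with. To be able to unit test my parser in the future, I need to extract a representative subset of the file without breaking the syntax at any level of the XML/RDF/OWL stack.

Is there a better way to do this than manually copying elements out of the real file?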

1 answer:

Answer 0 (score: 3)

Solution 1: using the Protégé editor

In Protégé, you can use the Refactor -> Copy/move/delete axioms... menu item to select a subset of axioms and export it to a new file.

Solution 2: using the Pellet reasoner

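You can use the extract feature of the Pellet reasoner. It lets you extract a subset based on the type of statement (for example SubClassOf or ClassAssertion), selected with the --statements option: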

PelletExtractInferences: Extract a set of inferences from an ontology

Usage: pellet extract [options] <file URI>...

Argument description:

--help, -h 
     Print this message 

--verbose, -v 
     Print full stack trace for errors. 

--config, -C (configuration file) 
     Use the selected configuration file 

--statements, -s (Space separated list surrounded by quotes) 
     Statements to extract. The option accepts all axioms of the OWL functional 
     syntax plus some additional ones. Valid arguments are: [DefaultStatements, 
     AllClass, AllIndividual, AllProperty, AllStatements, 
     AllStatementsIncludingJena, ClassAssertion, ComplementOf, 
     DataPropertyAssertion, DifferentIndividuals, DirectClassAssertion, 
     DirectSubClassOf, DirectSubPropertyOf, DisjointClasses, 
     DisjointProperties, EquivalentClasses, EquivalentProperties, 
     InverseProperties, ObjectPropertyAssertion, PropertyAssertion, 
     SameIndividual, SubClassOf, SubPropertyOf]. Example: "DirectSubClassOf 
     DirectSubPropertyOf" (Default: DefaultStatements) 

--loader, -l (Jena | OWLAPI | OWLAPIv3 | KRSS) 
     Use Jena, OWLAPI, OWLAPIv3 or KRSS to load the ontology (Default: 
     OWLAPIv3) 

--ignore-imports 
     Ignore imported ontologies 

--input-format (RDF/XML | Turtle | N-Triples) 
     Format of the input file (valid only for the Jena loader). Default 
     behaviour is to guess the input format based on the file extension.
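
To produce the test fixture repeatably, the extract call can be wrapped in a small script. The following is only a minimal sketch: it uses the options shown in the help text above, but the function names, the executable name "pellet" (your installation may ship a pellet.sh or pellet.bat wrapper instead) and the assumption that the extracted statements are written to standard output are mine, so check them against your Pellet version.

```python
import shlex
import subprocess


def build_pellet_extract_command(
    ontology_uri,
    statements=("DirectSubClassOf", "DirectSubPropertyOf"),
    ignore_imports=False,
    executable="pellet",  # assumption: name of the Pellet launcher on PATH
):
    """Build the argument list for `pellet extract` (hypothetical helper)."""
    # --statements takes one space-separated string, per the help text.
    cmd = [executable, "extract", "--statements", " ".join(statements)]
    if ignore_imports:
        cmd.append("--ignore-imports")
    cmd.append(ontology_uri)
    return cmd


def run_pellet_extract(ontology_uri, output_path, **options):
    """Run the extraction and save whatever Pellet prints to output_path.

    Assumption: the extracted statements go to standard output.
    """
    cmd = build_pellet_extract_command(ontology_uri, **options)
    with open(output_path, "w", encoding="utf-8") as out:
        subprocess.run(cmd, stdout=out, check=True)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    example = build_pellet_extract_command(
        "file:///path/to/ontology.owl",  # placeholder URI
        statements=("ClassAssertion", "SubClassOf"),
        ignore_imports=True,
    )
    print(" ".join(shlex.quote(arg) for arg in example))
```

Passing the arguments as a list avoids shell quoting problems with the space-separated --statements value. The resulting file is still worth a manual check before it is committed as a unit test fixture.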