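I have a fairly long XML/RDF/OWL file that I need to parse into a set of dictionaries I am working with. To be able to unit test my parser later, I need to extract a representative subset of the XML file without breaking the syntax of the XML/RDF/OWL stack.
Is there a better way to do this than manually copying elements out of the real file?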
Answer 0 (score: 3)
You can use the Protégé Refactor -> Copy/move/delete axioms... menu item to select a subset and export it to a new file.
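If a scripted route is preferred over picking elements by hand, the same idea can be roughed out with Python's standard library. This is only a sketch under stated assumptions: the file is RDF/XML with an `rdf:RDF` root, "representative" is taken to mean "a few top-level elements of each kind", and the names `extract_subset`, `per_tag` and `SAMPLE` are made up for illustration. It keeps the XML well-formed, but it does not check that the remaining axioms still reference resources that survive the cut.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

# Keep readable prefixes when the tree is serialized again.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("owl", OWL_NS)

# Tiny hypothetical ontology used only to demonstrate the function.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:owl="{OWL_NS}">
  <owl:Class rdf:about="http://example.org/A"/>
  <owl:Class rdf:about="http://example.org/B"/>
  <owl:Class rdf:about="http://example.org/C"/>
  <owl:ObjectProperty rdf:about="http://example.org/p"/>
</rdf:RDF>"""


def extract_subset(source_xml: str, per_tag: int = 2) -> str:
    """Return the document with at most `per_tag` top-level elements per tag."""
    root = ET.fromstring(source_xml)
    kept = {}  # qualified tag name -> number of elements kept so far
    for child in list(root):  # copy the list because we remove while iterating
        count = kept.get(child.tag, 0)
        if count >= per_tag:
            root.remove(child)
        else:
            kept[child.tag] = count + 1
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(extract_subset(SAMPLE, per_tag=1))
```

Because whole child elements are removed from the parsed tree and the root is serialized again, the result is still a single well-formed `rdf:RDF` document that a parser under test can load as a fixture.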
You can use the extract feature of the Pellet reasoner. It lets you extract a subset based on the type of assertion.
PelletExtractInferences: Extract a set of inferences from an ontology

Usage: pellet extract [options] <file URI>...

Argument description:

--help, -h
    Print this message
--verbose, -v
    Print full stack trace for errors.
--config, -C (configuration file)
    Use the selected configuration file
--statements, -s (Space separated list surrounded by quotes)
    Statements to extract. The option accepts all axioms of the OWL functional
    syntax plus some additional ones. Valid arguments are: [DefaultStatements,
    AllClass, AllIndividual, AllProperty, AllStatements,
    AllStatementsIncludingJena, ClassAssertion, ComplementOf,
    DataPropertyAssertion, DifferentIndividuals, DirectClassAssertion,
    DirectSubClassOf, DirectSubPropertyOf, DisjointClasses,
    DisjointProperties, EquivalentClasses, EquivalentProperties,
    InverseProperties, ObjectPropertyAssertion, PropertyAssertion,
    SameIndividual, SubClassOf, SubPropertyOf]. Example: "DirectSubClassOf
    DirectSubPropertyOf" (Default: DefaultStatements)
--loader, -l (Jena | OWLAPI | OWLAPIv3 | KRSS)
    Use Jena, OWLAPI, OWLAPIv3 or KRSS to load the ontology (Default:
    OWLAPIv3)
--ignore-imports
    Ignore imported ontologies
--input-format (RDF/XML | Turtle | N-Triples)
    Format of the input file (valid only for the Jena loader). Default
    behaviour is to guess the input format based on the file extension.
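
To build test fixtures repeatedly, the command described by the help text above could be driven from a script. The following is a hedged sketch rather than a documented recipe: it uses only the options listed in that help output, and it assumes the launcher is reachable on the PATH as `pellet` (it may be `pellet.sh` or `pellet.bat` in a given installation) and that the extracted statements are written to standard output, which the help text does not state. The function names and the example file URI are hypothetical.

```python
import shutil
import subprocess


def build_pellet_extract_command(
    file_uri,
    statements=("DirectSubClassOf", "DirectSubPropertyOf"),
    loader=None,
    input_format=None,
    ignore_imports=False,
    executable="pellet",  # assumption: launcher name on this machine
):
    """Assemble the argument list for `pellet extract`."""
    # --statements expects ONE space separated value; passing it as a single
    # list item plays the role of the surrounding quotes on a shell.
    cmd = [executable, "extract", "--statements", " ".join(statements)]
    if loader is not None:
        cmd += ["--loader", loader]
    if input_format is not None:
        # Per the help text, only meaningful together with the Jena loader.
        cmd += ["--input-format", input_format]
    if ignore_imports:
        cmd.append("--ignore-imports")
    cmd.append(file_uri)
    return cmd


def run_pellet_extract(cmd):
    """Run the command and return its stdout, or None if Pellet is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file URI, for illustration only.
    command = build_pellet_extract_command(
        "file:///path/to/ontology.owl",
        loader="Jena",
        input_format="RDF/XML",
    )
    print(command)
    print(run_pellet_extract(command))
```

Keeping command construction separate from execution means the argument list can be checked in a unit test on machines where Pellet is not installed.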