Unable to parse some Turtle-format files with the OWL API

Date: 2017-01-20 16:30:31

Tags: ontology protege owl-api turtle-rdf

I want to read the classes of the RDF/Turtle version of LNC/LOINC from BioPortal, which can be found under the latest submission at http://bioportal.bioontology.org/ontologies/LOINC/.

My parsing code is as simple as this:

    OWLOntologyManager ontologyManager = OWLManager.createOWLOntologyManager();
    ontologyManager.loadOntologyFromOntologyDocument(new File("LOINC.ttl"));

However, I get an error saying that no parser was able to parse the ontology (shortened due to the character limit):

   Exception in thread "main" org.semanticweb.owlapi.io.UnparsableOntologyException: Problem parsing file:/home/faessler/Coding/workspace/bioportal-ontology-tools/LOINC.ttl
Could not parse ontology.  Either a suitable parser could not be found, or parsing failed.  See parser logs below for explanation.
The following parsers were tried:
1) org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser@3b9d6699
2) org.semanticweb.owlapi.owlxml.parser.OWLXMLParser@2ad3a1bb
3) org.semanticweb.owlapi.functional.parser.OWLFunctionalSyntaxOWLParser@120f38e6
4) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory@95fd655c
5) org.semanticweb.owlapi.manchestersyntax.parser.ManchesterOWLSyntaxOntologyParser@3ad394e6
6) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.NQuadsDocumentFormatFactory@6f9c39ad
7) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFJsonDocumentFormatFactory@cd748dc3
8) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.NTriplesDocumentFormatFactory@937ecd36
9) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.TrigDocumentFormatFactory@27e81c
10) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFJsonLDDocumentFormatFactory@dcacc47d
11) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.N3DocumentFormatFactory@9a5
12) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioRDFXMLDocumentFormatFactory@69b9a3bc
13) org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser@5b43e173
14) org.semanticweb.owlapi.rio.RioTrixParserFactory$TrixParserImpl : org.semanticweb.owlapi.formats.TrixDocumentFormatFactory@27e82d
15) org.semanticweb.owlapi.oboformat.OBOFormatOWLAPIParser@13cda7c9
16) org.semanticweb.owlapi.dlsyntax.parser.DLSyntaxOWLParser@1da6ee17
17) org.semanticweb.owlapi.krss2.parser.KRSS2OWLParser@253c1256
18) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.BinaryRDFDocumentFormatFactory@3bf24493
19) org.coode.owlapi.obo12.parser.OWLOBO12Parser@c827db
20) org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RDFaDocumentFormatFactory@264e8d


Detailed logs:
--------------------------------------------------------------------------------

SNIP

--------------------------------------------------------------------------------
Parser: org.semanticweb.owlapi.rio.RioParserImpl : org.semanticweb.owlapi.formats.RioTurtleDocumentFormatFactory@95fd655c
    Stack trace:
org.openrdf.rio.UnsupportedRDFormatException: Did not recognise RDF format object Turtle (mimeTypes=text/turtle, application/x-turtle; ext=ttl)
        org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:138)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:175)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:997)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:961)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:910)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:922)
        de.julielab.bioportal.ontologies.apps.Test.main(Test.java:43)
Did not recognise RDF format object Turtle (mimeTypes=text/turtle, application/x-turtle; ext=ttl)
        org.openrdf.rio.Rio.lambda$unsupportedFormat$0(Rio.java:630)
        java.util.Optional.orElseThrow(Optional.java:290)
        org.openrdf.rio.Rio.createParser(Rio.java:119)
        org.semanticweb.owlapi.rio.RioParserImpl.parseDocumentSource(RioParserImpl.java:173)
        org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:125)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:175)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:997)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:961)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:910)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:922)


--------------------------------------------------------------------------------

Parser: org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser@5b43e173
    Stack trace:
org.semanticweb.owlapi.rdf.turtle.parser.ParseException: Encountered " <PN_CHARS> "- "" at line 3635316, column 64.
Was expecting:
    "." ...
            org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser.parse(TurtleOntologyParser.java:60)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:175)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:997)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:961)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:910)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:922)
        de.julielab.bioportal.ontologies.apps.Test.main(Test.java:43)
Encountered " <PN_CHARS> "- "" at line 3635316, column 64.
Was expecting:
    "." ...
            org.semanticweb.owlapi.rdf.turtle.parser.TurtleParser.generateParseException(TurtleParser.java:1960)
        org.semanticweb.owlapi.rdf.turtle.parser.TurtleParser.jj_consume_token(TurtleParser.java:1829)
        org.semanticweb.owlapi.rdf.turtle.parser.TurtleParser.parseDocument(TurtleParser.java:111)
        org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser.parse(TurtleOntologyParser.java:56)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:175)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:997)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:961)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:910)
        uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:922)
        de.julielab.bioportal.ontologies.apps.Test.main(Test.java:43)



SNIP

Protégé loads the file without problems, and even using the TurtleParser directly works, as in the following snippet:

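    // Parse the file directly with the Rio Turtle parser (org.openrdf.rio.turtle.TurtleParser).
    java.net.URL documentUrl = new File("LOINC.ttl").toURI().toURL();
    RDFParser rdfParser = new TurtleParser();
    // Collect every parsed triple; Statement is org.openrdf.model.Statement.
    java.util.List<Statement> myList = new java.util.ArrayList<>();
    StatementCollector collector = new StatementCollector(myList);
    rdfParser.setRDFHandler(collector);
    // try-with-resources closes the stream even if parsing fails.
    try (InputStream inputStream = documentUrl.openStream()) {
        rdfParser.parse(inputStream, documentUrl.toString());
    } catch (IOException | RDFParseException | RDFHandlerException e) {
        e.printStackTrace();
    }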

This runs through to the end without errors. However, I depend on the OWL API.

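I don't think there is a syntax error, since Protégé opens the file without complaint (nothing unusual in the log). I also tried shortened versions of the file, since it is quite large. Using roughly half of the file works, but I could not find any information about a length limit in the OWL API. And then again, Protégé can open it.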
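Because the OWL API's own Turtle parser reports a position (line 3635316, column 64), the offending line can be printed directly instead of bisecting the file. The following is a minimal sketch using only the JDK. The class and method names are hypothetical, and it assumes the file is UTF-8 and that the parser counts lines from 1 in the same way BufferedReader splits them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class ShowLine {

    /** Returns the 1-based line lineNumber of file, or empty if the file is shorter. */
    static Optional<String> readLine(Path file, long lineNumber) throws IOException {
        // Stream the file line by line so a multi-gigabyte file is never held in memory.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            long current = 0;
            while ((line = reader.readLine()) != null) {
                if (++current == lineNumber) {
                    return Optional.of(line);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Defaults are the file name and line number from the error message above.
        Path file = Paths.get(args.length > 0 ? args[0] : "LOINC.ttl");
        long lineNumber = args.length > 1 ? Long.parseLong(args[1]) : 3635316L;
        Optional<String> line = readLine(file, lineNumber);
        System.out.println(line.orElse("(file has fewer than " + lineNumber + " lines)"));
    }
}
```

Running it as `java ShowLine LOINC.ttl 3635316` would print the line the parser stopped on, which can then be checked by eye around column 64.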

The same happens with the MESH.ttl and PDQ.ttl files from BioPortal. NCBITAXON.ttl, however, works.

The OWL API version is 5.0.5; Protégé 5.0 beta for Mac was used to open the file successfully.

I would be very grateful for any hints, because right now I really have no idea what the problem is.

Thanks!

1 answer:

Answer 0 (score: 0)

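The OWL API has no explicit limits other than the available memory and the size of its collections, which is bounded by the maximum value an integer can hold.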

What do the detailed logs say for this parser?

13) org.semanticweb.owlapi.rdf.turtle.parser.TurtleOntologyParser@5b43e173

This is the OWL API's own Turtle parser, not the Rio parser.

Also, which versions of the OWL API and Protégé are you using?

EDIT: With the parser's error message, I found the line that fails:

<http://purl.bioontology.org/ontology/LNC/LRN2> """3'''-acetate; [cut]"""^^xsd:string ;

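The problem is the three single quotes: ''' is being interpreted as equivalent to """, which is a literal delimiter. Older OWL API versions did not follow the spec correctly here, but it looks like this literal is malformed, so it fails with newer OWL API versions. Protégé beta builds up to (I think) build 15 use OWL API 3.5, which has an older parser.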
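For reference, in the Turtle grammar a long literal opened with """ ends only at the next unescaped """, and a literal opened with ''' ends only at the next unescaped '''; the other quote style is ordinary content. The sketch below illustrates that delimiter matching with plain JDK code. It is not the OWL API's or Rio's actual implementation, the class and method names are hypothetical, and the sample string is a shortened form of the literal quoted above.

```java
public class LongLiteralScanner {

    /**
     * Returns the index just past the closing delimiter of the long literal
     * that starts at start, or -1 if no terminated long literal starts there.
     */
    static int endOfLongLiteral(String text, int start) {
        String delimiter;
        if (text.startsWith("\"\"\"", start)) {
            delimiter = "\"\"\"";
        } else if (text.startsWith("'''", start)) {
            delimiter = "'''";
        } else {
            return -1;
        }
        int i = start + 3;
        while (i < text.length()) {
            if (text.charAt(i) == '\\') {
                i += 2; // skip the backslash and the character it escapes
            } else if (text.startsWith(delimiter, i)) {
                return i + 3; // only the SAME delimiter closes the literal
            } else {
                i++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened form of the failing literal: ''' appears inside a """ literal.
        String line = "\"\"\"3'''-acetate\"\"\"^^xsd:string ;";
        int end = endOfLongLiteral(line, 0);
        System.out.println(line.substring(3, end - 3)); // prints 3'''-acetate
    }
}
```

With this matching rule the ''' run is kept as part of the lexical form, and the datatype suffix ^^xsd:string follows the closing """ as expected.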

I am not sure whether this can be corrected in the parser or whether the data needs to be fixed at this stage. I will raise an issue on GitHub: https://github.com/owlcs/owlapi/issues/610
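To judge how much data is affected, one option is to list the lines in which ''' occurs inside a """ literal. The following is a rough heuristic sketch, not a Turtle parser: it assumes each such literal opens and closes on a single line (as in the line quoted above) and that the file is UTF-8. The class and method names are hypothetical. Whether MESH.ttl and PDQ.ttl fail for the same reason is not confirmed; this would only be a way to check.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TripleQuoteFinder {

    /** Heuristic: true if ''' appears between the first and the last """ on the line. */
    static boolean hasSingleQuotesInsideLongLiteral(String line) {
        int open = line.indexOf("\"\"\"");
        int close = line.lastIndexOf("\"\"\"");
        // close >= open + 3 ensures the two delimiters do not overlap.
        return open >= 0 && close >= open + 3
                && line.substring(open + 3, close).contains("'''");
    }

    /** Returns the 1-based numbers of all lines matching the heuristic. */
    static List<Long> findSuspectLines(Path file) throws IOException {
        List<Long> hits = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            long lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (hasSingleQuotesInsideLongLiteral(line)) {
                    hits.add(lineNumber);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "LOINC.ttl");
        System.out.println("Suspect lines: " + findSuspectLines(file));
    }
}
```

Literals that span several lines would be missed by this check, so an empty result does not prove a file is unaffected.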

Second edit: this is a bug; the literal should have been parsed correctly. See the relevant Turtle specs.