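RDF4J Workbench: Why is one SPIN constructor so slow?

Time: 2016-08-29 18:35:56

Tags: sparql rdf sesame topbraid-composer rdf4j

I apologize for the length of this post. I have tried to make this slow-rule problem reproducible.

I am using TopBraid Composer FE to create an RDF file containing an ontology and SPIN constructors. The purpose of the SPIN constructors is to check that instantiated individuals of the classes defined in the ontology are compliant. I am finding that one SPIN constructor executes very slowly, and I would like to know why.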

Ontology including SPIN constructors: SXXIComplianceCheck18.rdf

I modify/clear my repository (an in-memory store with RDFS + SPIN support) and load this ontology into the RDF4J Workbench. RDF4J Workbench system information:

[Two screenshots of the RDF4J Workbench]

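Next, I run two SPARQL Update queries in sequence to create individuals of the classes defined in the ontology (the RDF file above), which triggers the SPIN constructors.

First SPARQL Update query (instantiates the individual data items and invokes the parsing constructors as needed; runs quickly):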

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX sp: <http://spinrdf.org/sp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX smf: <http://topbraid.org/sparqlmotionfunctions#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX spl: <http://spinrdf.org/spl#>
PREFIX spin: <http://spinrdf.org/spin#>
PREFIX arg: <http://spinrdf.org/arg#>
PREFIX SXXIComplianceCheckIndividuals: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX sxxicci: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>


INSERT DATA
{
   sxxicci:testPub7Proposal_DataItem110 a sxxicc:Pub7DataItem110 ;
           sxxicc:pub7DataItemHasRawStringValue "(C) M221.5"^^xsd:string .  

   sxxicci:testPub7Proposal_DataItem500 a sxxicc:Pub7DataItem500 ;
           sxxicc:pub7DataItemHasRawStringValue "S181"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem300 a sxxicc:Pub7DataItem300 ;
           sxxicc:pub7DataItemHasRawStringValue "DC"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem113 a sxxicc:Pub7DataItem113 ;
           sxxicc:pub7DataItemHasRawStringValue "FX"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem511 a sxxicc:Pub7DataItem511 ;
           sxxicc:pub7DataItemHasRawStringValue "SIGHT SEEING"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem501 a sxxicc:Pub7DataItem501 ;
#            sxxicc:pub7DataItemHasRawStringValue "M002"^^xsd:string .
           sxxicc:pub7DataItemHasRawStringValue "M018,     160727"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem503 a sxxicc:Pub7DataItem503 ;
           sxxicc:pub7DataItemHasRawStringValue "CUBESAT, TOM"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem504a a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 1"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "1"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem504b a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 2"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "2"^^xsd:integer .           

   sxxicci:testPub7Proposal_DataItem504c a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 3"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "3"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem504d a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 4"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "4"^^xsd:integer .         

   sxxicci:testPub7Proposal_DataItem504e a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 5"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "5"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem504f a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 6"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "6"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem144 a sxxicc:Pub7DataItem144 ;
           sxxicc:pub7DataItemHasRawStringValue "Y"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem005 a sxxicc:Pub7DataItem005 ;
           sxxicc:pub7DataItemHasRawStringValue "SF"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem102 a sxxicc:Pub7DataItem102 ;
           sxxicc:pub7DataItemHasRawStringValue "S   881234"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem017 a sxxicc:Pub7DataItem017 ;
           sxxicc:pub7DataItemHasRawStringValue "C"^^xsd:string .
}
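
Before running the second update, it may help to confirm that the parsing constructors actually fired. The sketch below assumes that they derive sxxicc:pub7DataItemHasStringValue from sxxicc:pub7DataItemHasRawStringValue. That is inferred from the compliance constructor shown further down, which reads the former while the update above only asserts the latter. The query lists each data item's raw value next to any derived value.

```sparql
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>

# List every data item with its raw value and, if present, the value
# assumed to be derived by the parsing constructors
# (sxxicc:pub7DataItemHasStringValue).
SELECT ?dataItem ?rawValue ?parsedValue
WHERE {
    ?dataItem sxxicc:pub7DataItemHasRawStringValue ?rawValue .
    OPTIONAL { ?dataItem sxxicc:pub7DataItemHasStringValue ?parsedValue . }
}
ORDER BY ?dataItem
```

A row with an unbound ?parsedValue would indicate a data item whose parsing constructor did not produce a value, which would also keep the compliance rule from matching.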

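Second SPARQL Update query (instantiates a proposal that ties together the data items created by the first query and runs the compliance-check constructors; runs very slowly on my computer, about 20 seconds):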

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX sp: <http://spinrdf.org/sp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX smf: <http://topbraid.org/sparqlmotionfunctions#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX spl: <http://spinrdf.org/spl#>
PREFIX spin: <http://spinrdf.org/spin#>
PREFIX arg: <http://spinrdf.org/arg#>
PREFIX SXXIComplianceCheckIndividuals: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX sxxicci: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

INSERT DATA
{

   sxxicci:TestPub7Proposal a sxxicc:Pub7Proposal ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem005 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem017 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem102 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem110 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem113 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem144 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem300 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem500 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem501 ; 
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem503 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504a ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504b ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504c ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504d ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504e ;
#           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504f ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem511 .


}

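The second query takes a long time to execute, about 20 seconds. This is out of line with the other compliance checks (not included in this RDF). I isolated this one rule from 13 other similar rules (mostly string parsing and comparison) because it dominates the time consumed.

(Correct but delayed) result:

[Screenshot: query result]

The problematic SPIN constructor (for the sxxicc:Pub7Proposal class):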
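
The result can also be checked as text rather than from a screenshot. A minimal sketch, using only the proposal IRI and the message property that appear in this post, is to list the compliance messages attached to the test proposal:

```sparql
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX sxxicci: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>

# Show every compliance message the constructors asserted for the test proposal.
SELECT ?message
WHERE {
    sxxicci:TestPub7Proposal sxxicc:pub7ProposalHasComplianceMessage ?message .
}
```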

# NEED MINUTE <M> NOTE M002 IN CIRCUIT REMARKS LISTING IRAC DOCUMENT GIVING APPROVAL FOR THIS ASSIGNMENT. (501 08)
CONSTRUCT {
    ?this sxxicc:pub7ProposalHasComplianceMessage "NEED MINUTE <M> NOTE M002 IN CIRCUIT REMARKS LISTING IRAC DOCUMENT GIVING APPROVAL FOR THIS ASSIGNMENT. (501 08)"^^xsd:string .
}
WHERE {
    ?this a sxxicc:Pub7Proposal .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem102 .
    ?dataItem102 a sxxicc:Pub7DataItem102 .
    ?dataItem102 sxxicc:pub7DataItemHasStringValue ?serialNumString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem500 .
    ?dataItem500 a sxxicc:Pub7DataItem500 .
    ?dataItem500 sxxicc:pub7DataItemHasStringValue ?iracNotesString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem501 .
    ?dataItem501 a sxxicc:Pub7DataItem501 .
    ?dataItem501 sxxicc:pub7DataItemHasStringValue ?notesFreeTextCommentsString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem113 .
    ?dataItem113 a sxxicc:Pub7DataItem113 .
    ?dataItem113 sxxicc:pub7DataItemHasStringValue ?stationClassString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem300 .
    ?dataItem300 a sxxicc:Pub7DataItem300 .
    ?dataItem300 sxxicc:pub7DataItemHasStringValue ?stateCountryForTransmittingStation .
    BIND (SUBSTR(?serialNumString, 1, 4) AS ?orgString) .
    BIND (SUBSTR(?stationClassString, 1, 2) AS ?stationClassCode) .
    FILTER (((((?orgString = "S   "^^xsd:string) && (?iracNotesString = "S181"^^xsd:string)) && (?notesFreeTextCommentsString != "M002"^^xsd:string)) && (?stationClassCode = "FX"^^xsd:string)) && (?stateCountryForTransmittingStation = "DC"^^xsd:string)) .
}
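
For debugging in the Workbench (Explore / Query), the same WHERE clause can be run as an ordinary query, as note 1 below describes. A sketch of that form follows. The only changes are the two PREFIX declarations the query needs when run standalone and the SELECT DISTINCT * in place of the CONSTRUCT clause.

```sparql
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Debug form of the constructor: same WHERE clause, but every binding is
# returned instead of constructing the compliance message.
SELECT DISTINCT *
WHERE {
    # Outside a SPIN constructor ?this is not pre-bound, so this triple binds it.
    ?this a sxxicc:Pub7Proposal .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem102 .
    ?dataItem102 a sxxicc:Pub7DataItem102 .
    ?dataItem102 sxxicc:pub7DataItemHasStringValue ?serialNumString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem500 .
    ?dataItem500 a sxxicc:Pub7DataItem500 .
    ?dataItem500 sxxicc:pub7DataItemHasStringValue ?iracNotesString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem501 .
    ?dataItem501 a sxxicc:Pub7DataItem501 .
    ?dataItem501 sxxicc:pub7DataItemHasStringValue ?notesFreeTextCommentsString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem113 .
    ?dataItem113 a sxxicc:Pub7DataItem113 .
    ?dataItem113 sxxicc:pub7DataItemHasStringValue ?stationClassString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem300 .
    ?dataItem300 a sxxicc:Pub7DataItem300 .
    ?dataItem300 sxxicc:pub7DataItemHasStringValue ?stateCountryForTransmittingStation .
    BIND (SUBSTR(?serialNumString, 1, 4) AS ?orgString) .
    BIND (SUBSTR(?stationClassString, 1, 2) AS ?stationClassCode) .
    FILTER (((((?orgString = "S   "^^xsd:string) && (?iracNotesString = "S181"^^xsd:string)) && (?notesFreeTextCommentsString != "M002"^^xsd:string)) && (?stationClassCode = "FX"^^xsd:string)) && (?stateCountryForTransmittingStation = "DC"^^xsd:string)) .
}
```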

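Why does this constructor run so slowly on a modern PC (AMD quad-core 2.3 GHz running Windows 8 with 16 GB of physical memory and no significant additional application load)? Other constructors run quickly on the same machine and do similar things with the same facts.

Here is the Java VisualVM Sampler output for an execution of this example:

[Screenshot: Java VisualVM Sampler output]

RDF4J's org.eclipse.rdf4j.common.concurrent.locks.LockManager$1.release() and org.eclipse.rdf4j.common.concurrent.locks.LockManager.createLock() dominate the self time. Why? Is there anything I can do to rewrite my rule to avoid this cost?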
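
One rewrite that could be tried is sketched below. It is untested and is not a known fix. It splits the single conjunctive FILTER into one FILTER per data item, placed next to the triples that bind the tested variable, and inlines the SUBSTR calls instead of using BIND. In SPARQL a FILTER applies to its whole group regardless of position, so this should match exactly the same solutions as the original. Whether RDF4J evaluates it any faster, or takes fewer locks, is an open assumption.

```sparql
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Untested variant of the constructor: same conditions, one FILTER per data item.
CONSTRUCT {
    ?this sxxicc:pub7ProposalHasComplianceMessage "NEED MINUTE <M> NOTE M002 IN CIRCUIT REMARKS LISTING IRAC DOCUMENT GIVING APPROVAL FOR THIS ASSIGNMENT. (501 08)"^^xsd:string .
}
WHERE {
    ?this a sxxicc:Pub7Proposal .

    # Serial number must start with "S" followed by three spaces.
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem102 .
    ?dataItem102 a sxxicc:Pub7DataItem102 .
    ?dataItem102 sxxicc:pub7DataItemHasStringValue ?serialNumString .
    FILTER (SUBSTR(?serialNumString, 1, 4) = "S   "^^xsd:string)

    # IRAC note must be S181.
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem500 .
    ?dataItem500 a sxxicc:Pub7DataItem500 .
    ?dataItem500 sxxicc:pub7DataItemHasStringValue ?iracNotesString .
    FILTER (?iracNotesString = "S181"^^xsd:string)

    # Free-text notes must not be M002.
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem501 .
    ?dataItem501 a sxxicc:Pub7DataItem501 .
    ?dataItem501 sxxicc:pub7DataItemHasStringValue ?notesFreeTextCommentsString .
    FILTER (?notesFreeTextCommentsString != "M002"^^xsd:string)

    # Station class must start with FX.
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem113 .
    ?dataItem113 a sxxicc:Pub7DataItem113 .
    ?dataItem113 sxxicc:pub7DataItemHasStringValue ?stationClassString .
    FILTER (SUBSTR(?stationClassString, 1, 2) = "FX"^^xsd:string)

    # Transmitting station must be in DC.
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem300 .
    ?dataItem300 a sxxicc:Pub7DataItem300 .
    ?dataItem300 sxxicc:pub7DataItemHasStringValue ?stateCountryForTransmittingStation .
    FILTER (?stateCountryForTransmittingStation = "DC"^^xsd:string)
}
```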

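Notes:

  1. The first triple of the WHERE clause is not needed in a SPIN constructor, because ?this is bound automatically. However, I include it so that the constructor can be copied into a SPARQL query in the Workbench (Explore / Query) for easier debugging. For constructor debugging I also find it convenient to replace the CONSTRUCT clause with "SELECT DISTINCT *" while keeping the WHERE clause.
  2. The only purpose of the WHERE clause in this constructor is to supply the graph pattern match that detects the error condition for the fixed error message in the CONSTRUCT clause. No bindings are carried from the WHERE clause into the CONSTRUCT clause, but a match of the WHERE clause still causes the triple in the CONSTRUCT clause to be asserted.
  3. Update

    I modified the constructor by removing one FILTER condition and its associated triples: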

    FILTER (?iracNotesString = "S181"^^xsd:string) .
    
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem500 .
    ?dataItem500 a sxxicc:Pub7DataItem500 .
    ?dataItem500 sxxicc:pub7DataItemHasStringValue ?iracNotesString .
    

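    This results in the constructor shown below in TBC FE:

    [Screenshot: modified constructor in TBC FE]

    Running the same test with the same two SPARQL Update queries, the execution time of the second query dropped very non-linearly, from 20 seconds to under 2 seconds. Again, this does not seem right.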
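
In text form, the modified constructor from the update would presumably read as follows. This is a reconstruction from the description above (the original constructor with the ?dataItem500 triples and the ?iracNotesString comparison removed), not a copy of what TBC FE displayed. The PREFIX lines are added only so that the query is self-contained.

```sparql
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Reconstructed modified constructor: no data item 500 pattern, no S181 test.
CONSTRUCT {
    ?this sxxicc:pub7ProposalHasComplianceMessage "NEED MINUTE <M> NOTE M002 IN CIRCUIT REMARKS LISTING IRAC DOCUMENT GIVING APPROVAL FOR THIS ASSIGNMENT. (501 08)"^^xsd:string .
}
WHERE {
    ?this a sxxicc:Pub7Proposal .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem102 .
    ?dataItem102 a sxxicc:Pub7DataItem102 .
    ?dataItem102 sxxicc:pub7DataItemHasStringValue ?serialNumString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem501 .
    ?dataItem501 a sxxicc:Pub7DataItem501 .
    ?dataItem501 sxxicc:pub7DataItemHasStringValue ?notesFreeTextCommentsString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem113 .
    ?dataItem113 a sxxicc:Pub7DataItem113 .
    ?dataItem113 sxxicc:pub7DataItemHasStringValue ?stationClassString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem300 .
    ?dataItem300 a sxxicc:Pub7DataItem300 .
    ?dataItem300 sxxicc:pub7DataItemHasStringValue ?stateCountryForTransmittingStation .
    BIND (SUBSTR(?serialNumString, 1, 4) AS ?orgString) .
    BIND (SUBSTR(?stationClassString, 1, 2) AS ?stationClassCode) .
    FILTER ((((?orgString = "S   "^^xsd:string) && (?notesFreeTextCommentsString != "M002"^^xsd:string)) && (?stationClassCode = "FX"^^xsd:string)) && (?stateCountryForTransmittingStation = "DC"^^xsd:string)) .
}
```

Compared with the original, the WHERE clause joins four data-item patterns on ?this instead of five.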

0 answers:

No answers