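I recently started using the full-text search in the Fuseki 0.2.8 snapshot.
I have an InfModel backed by a TDB dataset, to which I have added a Lucene text index. I have tested it with a search query like this: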
prefix text: <http://jena.apache.org/text#>
select distinct ?s where { ?s text:query ('stu' 16) }
This works fine until two or more queries hit Fuseki at the same time, at which point I occasionally get:
Error 500: Currently in a locked region Fuseki - version 0.2.8-SNAPSHOT (Build date: 20130820-0755).
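I tested the endpoint with 10 concurrent users sending queries at random intervals; over two minutes, roughly 30% of the queries returned the 500 error above.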
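A load test along these lines can be sketched in Python. This is a minimal sketch, not the script actually used: the host and port (localhost:3030, Fuseki's default port) are assumptions, the service path is taken from the assembler file below, and the names send_query and run_load_test as well as the one-second maximum pause are hypothetical.

```python
import random
import threading
import time
import urllib.error
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: Fuseki runs locally on its default port 3030; the path
# "dataset/query" comes from fuseki:name and fuseki:serviceQuery.
ENDPOINT = "http://localhost:3030/dataset/query"
QUERY = """prefix text: <http://jena.apache.org/text#>
select distinct ?s where { ?s text:query ('stu' 16) }"""


def send_query(endpoint=ENDPOINT, query=QUERY):
    """Send one SPARQL query via HTTP GET and return the HTTP status code.

    Returns None if the server cannot be reached at all.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    try:
        with urllib.request.urlopen(url, timeout=60) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 500 for "Currently in a locked region"
    except urllib.error.URLError:
        return None


def run_load_test(send, users=10, duration=120.0, max_pause=1.0):
    """Run `users` threads that call send() at random intervals.

    Returns a Counter mapping each observed status to its number of occurrences.
    """
    counts = Counter()
    counts_lock = threading.Lock()
    deadline = time.monotonic() + duration

    def worker():
        while time.monotonic() < deadline:
            status = send()
            with counts_lock:
                counts[status] += 1
            # Hypothetical pacing: wait a random time before the next query.
            time.sleep(random.uniform(0, max_pause))

    threads = [threading.Thread(target=worker) for _ in range(users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts
```

Calling run_load_test(send_query) runs for two minutes with 10 threads; dividing the count for status 500 by the total gives the failure rate.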
I also tried disabling inference by replacing this part (full assembler file below):
<#dataset_fulltext> rdf:type text:TextDataset ;
text:dataset <#dataset_inf> ;
##text:dataset <#tdbDataset> ;
text:index <#indexLucene> .
with this:
<#dataset_fulltext> rdf:type text:TextDataset ;
##text:dataset <#dataset_inf> ;
text:dataset <#tdbDataset> ;
text:index <#indexLucene> .
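With the TextDataset using #tdbDataset instead of #dataset_inf, no exceptions were raised.
Is there something wrong with my setup, or is this a bug in Fuseki?
Here is my current assembler file: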
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text: <http://jena.apache.org/text#> .
@prefix dc: <http://purl.org/dc/terms/> .
[] rdf:type fuseki:Server ;
# Timeout - server-wide default: milliseconds.
# Format 1: "1000" -- 1 second timeout
# Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of the query.
# See java doc for ARQ.queryTimeout
ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "12000,50000" ] ;
fuseki:services (
<#service1>
) .
# Custom code.
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
# TDB
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
## Initialize text query
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset rdfs:subClassOf ja:RDFDataset .
# Lucene index
text:TextIndexLucene rdfs:subClassOf text:TextIndex .
## ---------------------------------------------------------------
## Service with only SPARQL query on an inference model.
## Inference model with base data in TDB.
<#service1> rdf:type fuseki:Service ;
rdfs:label "TDB/text service" ;
fuseki:name "dataset" ; # http://host/dataset
fuseki:serviceQuery "query" ;
fuseki:serviceUpdate "update" ;
fuseki:serviceUpload "upload" ;
fuseki:serviceReadWriteGraphStore "data" ;
fuseki:serviceReadGraphStore "get" ;
fuseki:dataset <#dataset_fulltext> ;
.
<#dataset_inf> rdf:type ja:RDFDataset ;
ja:defaultGraph <#model_inf> .
<#model_inf> rdf:type ja:Model ;
ja:baseModel <#tdbGraph> ;
ja:reasoner [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner> ] .
<#tdbDataset> rdf:type tdb:DatasetTDB ;
tdb:location "Data" .
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:dataset <#tdbDataset> .
# Dataset with full text index.
<#dataset_fulltext> rdf:type text:TextDataset ;
text:dataset <#dataset_inf> ;
##text:dataset <#tdbDataset> ;
text:index <#indexLucene> .
# Text index description
<#indexLucene> a text:TextIndexLucene ;
text:directory <file:Lucene> ;
##text:directory "mem" ;
text:entityMap <#entMap> ;
.
# Mapping in the index
# URI stored in field "uri"
# dc:title and dc:description are mapped to field "text"
<#entMap> a text:EntityMap ;
text:entityField "uri" ;
text:defaultField "text" ;
text:map (
[ text:field "text" ; text:predicate dc:title ]
[ text:field "text" ; text:predicate dc:description ]
) .
Here is the full stack trace of one of the exceptions from the Fuseki log:
16:27:01 WARN Fuseki :: [2484] RC = 500 : Currently in a locked region
com.hp.hpl.jena.sparql.core.DatasetGraphWithLock$JenaLockException: Currently in a locked region
at com.hp.hpl.jena.sparql.core.DatasetGraphWithLock.checkNotActive(DatasetGraphWithLock.java:72)
at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.begin(DatasetGraphTrackActive.java:44)
at org.apache.jena.query.text.DatasetGraphText.begin(DatasetGraphText.java:102)
at org.apache.jena.fuseki.servlets.HttpAction.beginRead(HttpAction.java:117)
at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:236)
at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:195)
at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:80)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeLifecycle(SPARQL_ServletBase.java:185)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeAction(SPARQL_ServletBase.java:166)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.execCommonWorker(SPARQL_ServletBase.java:154)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:73)
at org.apache.jena.fuseki.servlets.SPARQL_Query.doGet(SPARQL_Query.java:61)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:298)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)
Any suggestions are appreciated.
Thanks, Stuart.
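Answer 0 (score: 1)
This looks like it may be the bug I have filed as JENA-522. If you have more details to add about the bug, please add a comment there.
The problem is that a dataset with inference implicitly uses ARQ's standard in-memory Dataset implementation, which does not support transactions.
However, text datasets, which correspond to DatasetGraphText internally (and in your stack trace), require the wrapped dataset to support transactions and, where it does not, wrap it with DatasetGraphWithLock. It is this wrapper that appears to be running into the problem with the lock. The documentation states that it should support multiple readers, but following the logic of the code I am not sure that it actually allows this.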
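To make the suspected failure mode concrete, here is a toy model in Python. It is not Jena's code: both guard classes and the overlapping_reads helper are hypothetical, and whether DatasetGraphWithLock really behaves like the first one is exactly the open question above. The sketch only shows how a guard that tracks activity with a single shared flag would reject a second reader, while one that counts readers would accept it.

```python
class SingleFlagGuard:
    """Hypothetical guard: one shared 'active' flag for all callers."""

    def __init__(self):
        self.active = False

    def begin(self):
        # Any begin() while another caller is active is rejected,
        # even if both callers only want to read.
        if self.active:
            raise RuntimeError("Currently in a locked region")
        self.active = True

    def end(self):
        self.active = False


class ReaderCountGuard:
    """Hypothetical guard that counts readers, so reads may overlap."""

    def __init__(self):
        self.readers = 0

    def begin(self):
        self.readers += 1

    def end(self):
        self.readers -= 1


def overlapping_reads(guard):
    """Begin two reads before ending either; return True if both were accepted."""
    guard.begin()          # first reader starts
    try:
        guard.begin()      # second reader starts while the first is still active
    except RuntimeError:
        guard.end()        # release the first reader
        return False
    guard.end()
    guard.end()
    return True
```

The two begin() calls are interleaved sequentially rather than run on real threads, which keeps the outcome deterministic; the guards themselves are not thread-safe and are meant only as an illustration.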