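TL;DR: go to the bottom.
I am working on a Solr 4.10.1 installation (solr-impl 4.10.1 1627268 - mike - 2014-09-24 06:07:51) set up with 3 servers and 3 shards. As far as I can tell, there is a ZooKeeper involved.
The schema contains products and prices. The product data is inserted/updated by one job, and the price fields are initially absent from the submitted documents. They are added by a separate job that only has the product-id and the price fields. For this purpose an updateRequestProcessorChain is set up: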
<updateRequestProcessorChain name="versionable_chain" default="false">
<processor class="solr.DocBasedVersionConstraintsProcessorFactory">
<str name="versionField">price_last_generation_id</str>
<bool name="ignoreOldUpdates">true</bool>
</processor>
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
The cloud layout looks as follows. I have simplified the names. The numbering is irrelevant:
                  /-- box-3 (active)
       /- shard1--+-- box-1 (active leader)
       |          \-- box-2 (down)
       |
       |          /-- box-2 (down)
- foo -+- shard2--+-- box-3 (active)
       |          \-- box-1 (active leader)
       |
       |          /-- box-1 (active leader)
       \- shard3--+-- box-3 (active)
                  \-- box-2 (down)
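I am currently not sure why box-2 shows as down, but I am fairly sure the problem had already shown up while that box was still active and I was accessing the frontend there. Comments on how to fix this would be appreciated.
The current data sizes are shown below. I collected the information from the admin web frontend on all three boxes.
On box-2: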
foo_shard1_replica3:
Last Modified:2 months ago (2014-11-30)
Num Docs:234044
Max Doc:262311
Heap Memory Usage:1652112
Deleted Docs:28267
Version:5893
Segment Count:10
foo_shard2_replica3:
Last Modified:2 months ago (2014-11-30)
Num Docs:303025
Max Doc:324491
Heap Memory Usage:1886264
Deleted Docs:21466
Version:7317
Segment Count:11
foo_shard3_replica3:
Last Modified:2 months ago (2014-11-30)
Num Docs:349651
Max Doc:397699
Heap Memory Usage:1895080
Deleted Docs:48048
Version:8893
Segment Count:12
On box-1:
foo_shard1_replica1:
Last Modified:7 days ago
Num Docs:299185
Max Doc:348179
Heap Memory Usage:1920704
Deleted Docs:48994
Version:23067
Segment Count:11
foo_shard2_replica1:
Last Modified:7 days ago
Num Docs:379024
Max Doc:443322
Heap Memory Usage:2119024
Deleted Docs:64298
Version:26871
Segment Count:12
foo_shard3_replica1:
Last Modified:7 days ago
Num Docs:373670
Max Doc:414497
Heap Memory Usage:2130464
Deleted Docs:40827
Version:29925
Segment Count:12
On box-3:
foo_shard1_replica2:
Last Modified:7 days ago
Num Docs:299185
Max Doc:314353
Heap Memory Usage:1878904
Deleted Docs:15168
Version:22740
Segment Count:11
foo_shard2_replica2:
Last Modified:7 days ago
Num Docs:379024
Max Doc:389958
Heap Memory Usage:2044384
Deleted Docs:10934
Version:26338
Segment Count:12
foo_shard3_replica2:
Last Modified:7 days ago
Num Docs:373670
Max Doc:402724
Heap Memory Usage:2127984
Deleted Docs:29054
Version:29598
Segment Count:12
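The figures above can be cross-checked with a short script. The sketch below is only an illustration and not part of the setup: it hard-codes the Num Docs, Max Doc and Deleted Docs values listed above, checks that Max Doc minus Num Docs equals Deleted Docs for every core, and reports which replicas have fewer documents than the largest replica of the same shard. The dictionary layout and the function names are made up for this example.

```python
# Core statistics copied from the admin UI listings above:
# (num_docs, max_doc, deleted_docs) per box and shard.
STATS = {
    "box-1": {"shard1": (299185, 348179, 48994),
              "shard2": (379024, 443322, 64298),
              "shard3": (373670, 414497, 40827)},
    "box-2": {"shard1": (234044, 262311, 28267),
              "shard2": (303025, 324491, 21466),
              "shard3": (349651, 397699, 48048)},
    "box-3": {"shard1": (299185, 314353, 15168),
              "shard2": (379024, 389958, 10934),
              "shard3": (373670, 402724, 29054)},
}


def inconsistent_counts(stats):
    """Return (box, shard) pairs where max_doc - num_docs != deleted_docs."""
    return [(box, shard)
            for box, shards in stats.items()
            for shard, (num_docs, max_doc, deleted) in shards.items()
            if max_doc - num_docs != deleted]


def lagging_replicas(stats):
    """Return {shard: {box: missing_docs}} relative to the largest replica."""
    result = {}
    shard_names = sorted({s for per_box in stats.values() for s in per_box})
    for shard in shard_names:
        counts = {box: per_box[shard][0]
                  for box, per_box in stats.items() if shard in per_box}
        top = max(counts.values())
        behind = {box: top - n for box, n in counts.items() if n != top}
        if behind:
            result[shard] = behind
    return result


if __name__ == "__main__":
    print(inconsistent_counts(STATS))
    print(lagging_replicas(STATS))
```

With the numbers above, box-1 and box-3 agree on Num Docs for every shard, and only the box-2 replicas (last modified 2014-11-30) are behind.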
When I run a query for a specific document and return the document id and the price, I sometimes get both fields and sometimes only the id.
GET http://box-2:8080/solr/foo_shard1_replica3/select?q=id%3A%22product-id%22&fl=id%2Cprice&wt=json&indent=true&debug=track
The query produces two different response bodies. This one has the price:
{
"response": {
"numFound": 1,
"start": 0,
"maxScore": 13.212859,
"docs": [
{
"id": "product-id",
"price": 174.8
}
]
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {}
},
"debug": {
"track": {
"rid": "box-2.internal-foo_shard1_replica3-1423579755522-18",
"EXECUTE_QUERY": {
"http://box-3.internal:8080/solr/foo_shard2_replica2/|http://box-1.internal:8080/solr/foo_shard2_replica1/": {
"ElapsedTime": "4",
"RequestPurpose": "GET_TOP_IDS,GET_FACETS",
"NumFound": "0",
"Response": "{response={numFound=0,start=0,maxScore=0.0,docs=[]},sort_values={},facet_counts={facet_queries={},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}},debug={}}"
},
"http://box-3.internal:8080/solr/foo_shard1_replica2/|http://box-1.internal:8080/solr/foo_shard1_replica1/": {
"ElapsedTime": "4",
"RequestPurpose": "GET_TOP_IDS,GET_FACETS",
"NumFound": "1",
"Response": "{response={numFound=1,start=0,maxScore=12.965124,docs=[SolrDocument{id=product-id, score=12.965124}]},sort_values={},facet_counts={facet_queries={},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}},debug={}}"
},
"http://box-1.internal:8080/solr/foo_shard3_replica1/|http://box-3.internal:8080/solr/foo_shard3_replica2/": {
"ElapsedTime": "4",
"RequestPurpose": "GET_TOP_IDS,GET_FACETS",
"NumFound": "1",
"Response": "{response={numFound=1,start=0,maxScore=13.212859,docs=[SolrDocument{id=product-id, score=13.212859}]},sort_values={},facet_counts={facet_queries={},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}},debug={}}"
}
},
"GET_FIELDS": {
"http://box-3.internal:8080/solr/foo_shard1_replica2/|http://box-1.internal:8080/solr/foo_shard1_replica1/": {
"ElapsedTime": "2",
"RequestPurpose": "GET_FIELDS,GET_DEBUG",
"NumFound": "1",
"Response": "{response={numFound=1,start=0,docs=[SolrDocument{id=product-id, price=174.8}]},debug={}}"
}
}
}
}
}
This one does not:
{
"response": {
"numFound": 1,
"start": 0,
"maxScore": 13.2416725,
"docs": [
{
"id": "product-id"
}
]
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {}
},
"debug": {
"track": {
"rid": "box-2.internal-foo_shard1_replica3-1423579848055-20",
"EXECUTE_QUERY": {
"http://box-3.internal:8080/solr/foo_shard2_replica2/|http://box-1:8080/solr/foo_shard2_replica1/": {
"ElapsedTime": "3",
"RequestPurpose": "GET_TOP_IDS,GET_FACETS",
"NumFound": "0",
"Response": "{response={numFound=0,start=0,maxScore=0.0,docs=[]},sort_values={},facet_counts={facet_queries={},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}},debug={}}"
},
"http://box-1:8080/solr/foo_shard3_replica1/|http://box-3.internal:8080/solr/foo_shard3_replica2/": {
"ElapsedTime": "2",
"RequestPurpose": "GET_TOP_IDS,GET_FACETS",
"NumFound": "1",
"Response": "{response={numFound=1,start=0,maxScore=13.2416725,docs=[SolrDocument{id=product-id, score=13.2416725}]},sort_values={},facet_counts={facet_queries={},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}},debug={}}"
},
"http://box-3.internal:8080/solr/foo_shard1_replica2/|http://box-1:8080/solr/foo_shard1_replica1/": {
"ElapsedTime": "4",
"RequestPurpose": "GET_TOP_IDS,GET_FACETS",
"NumFound": "1",
"Response": "{response={numFound=1,start=0,maxScore=12.965124,docs=[SolrDocument{id=product-id, score=12.965124}]},sort_values={},facet_counts={facet_queries={},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}},debug={}}"
}
},
"GET_FIELDS": {
"http://box-1:8080/solr/foo_shard3_replica1/|http://box-3.internal:8080/solr/foo_shard3_replica2/": {
"ElapsedTime": "2",
"RequestPurpose": "GET_FIELDS,GET_DEBUG",
"NumFound": "1",
"Response": "{response={numFound=1,start=0,docs=[SolrDocument{id=product-id}]},debug={}}"
}
}
}
}
}
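Both traces above report a NumFound of 1 from shard1 and from shard3, and the GET_FIELDS phase goes to shard1 in the response with the price and to shard3 in the response without it. One way to look at each copy separately might be to query every core directly with Solr's distrib=false parameter, which keeps the request on the core it is sent to. The sketch below only builds such URLs and summarises an EXECUTE_QUERY section. The helper names are hypothetical, and the core URLs are the ones that appear in the traces.

```python
import re
from urllib.parse import urlencode

# Core base URLs as they appear in the debug traces above.
CORES = [
    "http://box-1.internal:8080/solr/foo_shard1_replica1",
    "http://box-3.internal:8080/solr/foo_shard1_replica2",
    "http://box-1.internal:8080/solr/foo_shard3_replica1",
    "http://box-3.internal:8080/solr/foo_shard3_replica2",
]


def direct_query_url(core_url, doc_id, fields=("id", "price")):
    """Build a select URL that is answered by this core only (distrib=false)."""
    params = {
        "q": 'id:"%s"' % doc_id,
        "fl": ",".join(fields),
        "wt": "json",
        "distrib": "false",
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def shards_with_hits(execute_query):
    """Return the sorted shard names whose NumFound is greater than zero.

    `execute_query` is the "EXECUTE_QUERY" object of a debug=track response.
    """
    hits = set()
    for replicas, info in execute_query.items():
        if int(info["NumFound"]) > 0:
            hits.update(re.findall(r"_(shard\d+)_replica", replicas))
    return sorted(hits)


if __name__ == "__main__":
    for core in CORES:
        print(direct_query_url(core, "product-id"))
```

Comparing the per-core answers for the same id would show which copy carries the price field and which does not.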
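While the data insert job was running, I looked at the access logs of all three machines. The inserts are distributed more or less evenly. All machines receive some.
In the logs accessible from the web frontend I found some errors. These errors are a few days old but had not occurred before that, even though a regular update process runs every few hours.
In the frontend accessed through box-1: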
1/31/2015, 8:14:51 AM
ERROR
StreamingSolrServers
error
org.apache.solr.common.SolrException: Internal Server Error
request: http://box-1.internal:8080/solr/foo_shard2_replica1/update?update.chain=versionable_chain&update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fbox-1.internal%3A8080%2Fsolr%2Ffoo_shard1_replica1%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
In the frontend accessed through box-2:
1/31/2015, 8:15:13 AM
ERROR
StreamingSolrServers
error
org.apache.solr.common.SolrException: Internal Server Error
request: http://box-1.internal:8080/solr/foo_shard3_replica1/update?update.chain=versionable_chain&update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fbox-2.internal%3A8080%2Fsolr%2Ffoo_shard1_replica3%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
In the frontend accessed through box-3:
1/31/2015, 8:13:02 AM
ERROR
SolrDispatchFilter
null:org.apache.solr.common.SolrException: Internal Server Error
null:org.apache.solr.common.SolrException: Internal Server Error
request: http://box-1.internal:8080/solr/foo_shard2_replica1/update?update.chain=versionable_chain&update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fbox-3.internal%3A8080%2Fsolr%2Ffoo_shard1_replica2%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
On box-1 I also have an issue with the update processor. I do not know whether it is related.
1/31/2015, 8:15:05 AM
ERROR
SolrDispatchFilter
null:org.apache.solr.common.SolrException: Doc exists in index, but has null versionField: price_last_generation_id
null:org.apache.solr.common.SolrException: Doc exists in index, but has null versionField: price_last_generation_id
at org.apache.solr.update.processor.DocBasedVersionConstraintsProcessorFactory$DocBasedVersionConstraintsProcessor.isVersionNewEnough(DocBasedVersionConstraintsProcessorFactory.java:328)
at org.apache.solr.update.processor.DocBasedVersionConstraintsProcessorFactory$DocBasedVersionConstraintsProcessor.processAdd(DocBasedVersionConstraintsProcessorFactory.java:399)
at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:96)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173)
at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:99)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
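To make the error message easier to reason about, here is a simplified model of the version check as the stack trace and the chain configuration suggest it works. It is a sketch under assumptions, not the actual Solr code: the index is a plain dict keyed by id, and NullVersionError, is_version_new_enough and process_add are names invented for this example.

```python
VERSION_FIELD = "price_last_generation_id"


class NullVersionError(Exception):
    """Stand-in for the SolrException seen in the log above."""


def is_version_new_enough(existing_doc, new_version, version_field=VERSION_FIELD):
    """Decide whether an incoming update may replace the indexed document."""
    if existing_doc is None:
        return True  # nothing indexed under this id yet
    old_version = existing_doc.get(version_field)
    if old_version is None:
        # The indexed document exists but carries no version to compare with.
        raise NullVersionError(
            "Doc exists in index, but has null versionField: " + version_field)
    return new_version > old_version


def process_add(index, doc, ignore_old_updates=True, version_field=VERSION_FIELD):
    """Apply `doc` to the dict `index`; return True if it was written."""
    new_version = doc[version_field]
    if not is_version_new_enough(index.get(doc["id"]), new_version, version_field):
        if ignore_old_updates:
            return False  # stale update is dropped silently
        raise ValueError("version conflict for id " + doc["id"])
    index[doc["id"]] = doc
    return True


if __name__ == "__main__":
    index = {"p2": {"id": "p2"}}  # product indexed without any price fields
    try:
        process_add(index, {"id": "p2", VERSION_FIELD: 3, "price": 9.9})
    except NullVersionError as exc:
        print(exc)
```

In this model, a product document indexed without price_last_generation_id followed by a price update sent through versionable_chain raises the same message as in the log. Whether that is what happens on box-1 is an assumption that would need to be checked against the real data.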
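While writing this I noticed that replication may not be working properly, which might be the problem. However, this error was first reported 3 weeks before the replicas started to diverge.
So the question is: why do some documents not end up complete on all nodes?
A reasonable follow-up: how can this be fixed?
I inherited this system, and I would appreciate comments on what I can do to isolate the cause.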