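I am using Search::Elasticsearch and Search::Elasticsearch::Scroll to search and scroll through my Elasticsearch server.
While scrolling, for some queries I see the following error as I iterate over the search results: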
2016/03/22 11:03:38 - 265885 FATAL: [Daemon.pm][8221]: Something gone wrong, error $VAR1 = bless( {
'msg' => '[Missing] ** [http://localhost:9200]-[404] Not Found, called from sub Search::Elasticsearch::Scroll::next at searcher.pl line 92. With vars: {\'body\' => {\'hits\' => {\'hits\' => [],\'max_score\' => \'0\',\'total\' => 5215},\'timed_out\' => bless( do{\\(my $o = 0)}, \'JSON::XS::Boolean\' ),\'_shards\' => {\'failures\' => [{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [4920053]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [5051485]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [4920059]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [5051496]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1},{\'index\' => undef,\'reason\' => {\'reason\' => \'No search context found for id [5051500]\',\'type\' => \'search_context_missing_exception\'},\'shard\' => -1}],\'failed\' => 5,\'successful\' => 0,\'total\' => 5},\'_scroll_id\' => \'c2NhbjswOzE7dG90YWxfaGl0czo1MjE1Ow==\',\'took\' => 2},\'request\' => {\'serialize\' => \'std\',\'path\' => \'/_search/scroll\',\'ignore\' => [],\'mime_type\' => \'application/json\',\'body\' => \'c2Nhbjs1OzQ5MjAwNTM6bHExbENzRDVReEc0OV9UMUgzd3Vkdzs1MDUxNDg1OnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7NDkyMDA1OTpscTFsQ3NENVF4RzQ5X1QxSDN3dWR3OzUwNTE0OTY6cmtDeWxSREpUdHFFZFZ5RGg5MHhZUTs1MDUxNTAwOnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7MTt0b3RhbF9oaXRzOjUyMTU7\',\'qs\' => {\'scroll\' => \'1m\'},\'method\' => \'GET\'},\'status_code\' => 404}
',
'stack' => [
[
'searcher.pl',
92,
'Search::Elasticsearch::Scroll::next'
]
],
'text' => '[http://localhost:9200]-[404] Not Found',
'vars' => {
'body' => {
'hits' => {
'hits' => [],
'max_score' => '0',
'total' => 5215
},
'timed_out' => bless( do{\(my $o = 0)}, 'JSON::XS::Boolean' ),
'_shards' => {
'failures' => [
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [4920053]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [5051485]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [4920059]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [5051496]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
},
{
'index' => undef,
'reason' => {
'reason' => 'No search context found for id [5051500]',
'type' => 'search_context_missing_exception'
},
'shard' => -1
}
],
'failed' => 5,
'successful' => 0,
'total' => 5
},
'_scroll_id' => 'c2NhbjswOzE7dG90YWxfaGl0czo1MjE1Ow==',
'took' => 2
},
'request' => {
'serialize' => 'std',
'path' => '/_search/scroll',
'ignore' => [],
'mime_type' => 'application/json',
'body' => 'c2Nhbjs1OzQ5MjAwNTM6bHExbENzRDVReEc0OV9UMUgzd3Vkdzs1MDUxNDg1OnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7NDkyMDA1OTpscTFsQ3NENVF4RzQ5X1QxSDN3dWR3OzUwNTE0OTY6cmtDeWxSREpUdHFFZFZ5RGg5MHhZUTs1MDUxNTAwOnJrQ3lsUkRKVHRxRWRWeURoOTB4WVE7MTt0b3RhbF9oaXRzOjUyMTU7',
'qs' => {
'scroll' => '1m'
},
'method' => 'GET'
},
'status_code' => 404
},
'type' => 'Missing'
}, 'Search::Elasticsearch::Error::Missing' );
The code I am using is the following (simplified):
# Retrieve scroll
my $scroll = $self->getScrollBySignature($item);
# Retrieve all affected documents ids
while (my @docs = $scroll->next(500)) {
# Do stuff with @docs
}
The getScrollBySignature function calls Elasticsearch with the following code:
my $scroll = $self->{ELASTIC}->scroll_helper(
index => $self->{INDEXES},
search_type => 'scan',
ignore_unavailable => 1,
body => {
size => $self->{PAGINATION},
query => {
filtered => {
filter => {
bool => {
must => [{term => {signature_id => $item->{profileId}}}, {terms => {channel_type_id => $type}}]
}
}
}
}
}
);
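As you can see, I create the scroll without passing the scroll parameter, so, as the documentation says, the scroll is kept alive for 1 minute.
The Elasticsearch installation is a cluster of 3 servers, and the queries that end with this error retrieve more than 5000 documents.
My first solution was to raise the scroll keep-alive to 5 minutes, and the error no longer appeared.
The question is: as far as I know, every call to $scroll->next() renews the scroll's keep-alive by 1m, so how can I still get these missing-context errors?
What am I doing wrong?
Thank you all.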
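Answer 0 (score: 1)
The first thing that comes to mind is that the timer is not being renewed. Have you checked that? For example, you could make one request every 10 seconds and see whether the 6th request gives you the error...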
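One way to reason about the timer is with a toy model. The sketch below does not talk to Elasticsearch; `ToyScroll` is a hypothetical stand-in that assumes the client keeps fetched documents in a local buffer and only contacts the server (renewing the keep-alive) when that buffer runs short. The page size of 2500 and the per-batch work times are illustrative numbers, not values from the question.

```perl
use strict;
use warnings;

# Toy model (not the real client): a scroll whose server-side context
# expires $ttl seconds after the last fetch from the "server".
package ToyScroll;

sub new {
    my ($class, %args) = @_;
    return bless {
        ttl        => $args{ttl},      # keep-alive in seconds
        page       => $args{page},     # docs returned per server fetch (assumed)
        remaining  => $args{total},    # docs still held by the "server"
        buffer     => [],              # docs fetched but not yet handed out
        last_fetch => 0,               # simulated time of the last fetch
    }, $class;
}

# Hand out up to $n docs; contact the "server" only when the buffer runs short.
sub next {
    my ($self, $n, $now) = @_;
    while (@{ $self->{buffer} } < $n && $self->{remaining} > 0) {
        die "search_context_missing_exception\n"
            if $now - $self->{last_fetch} > $self->{ttl};
        my $take = $self->{remaining} < $self->{page}
                 ? $self->{remaining}
                 : $self->{page};
        push @{ $self->{buffer} }, (1) x $take;
        $self->{remaining} -= $take;
        $self->{last_fetch} = $now;    # the keep-alive is renewed here only
    }
    return splice @{ $self->{buffer} }, 0, $n;
}

package main;

# Run a loop that spends $work simulated seconds on every batch of 500 docs.
sub simulate {
    my ($work) = @_;
    my $scroll = ToyScroll->new(ttl => 60, page => 2500, total => 5215);
    my ($now, $seen) = (0, 0);
    my $ok = eval {
        while (my @docs = $scroll->next(500, $now)) {
            $seen += @docs;
            $now  += $work;            # simulated processing time
        }
        1;
    };
    return $ok ? "ok after $seen docs" : "failed after $seen docs";
}

print simulate(5),  "\n";   # prints "ok after 5215 docs"
print simulate(20), "\n";   # prints "failed after 2500 docs"
```

In this model, 20 seconds per batch is well under the 60-second keep-alive, yet the run still fails: five batches are served from the buffer, so 100 seconds pass between two server fetches. If the real client buffers in a similar way, that would explain why calling `next` often is not enough on its own.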
Answer 1 (score: 0)
Well, a good rule of thumb is: inside the while block, do not exceed the time you configured for the scroll.
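Between consecutive calls to ->next() you cannot spend more than the configured time. If you do, the scroll context may no longer exist and the search_context_missing_exception error will appear.
My solution to this problem was to only store the data in an array/hash structure inside the loop and, once the scrolling has finished, process all the data.
Solution for the question example: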
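A minimal sketch of this collect-first approach, not necessarily the answerer's own code. `collect_ids` and `FakeScroll` are hypothetical names introduced here; `FakeScroll` only imitates the `next($n)` method so the example runs without a cluster. With the real client, `$scroll` would be the object returned by `getScrollBySignature` in the question.

```perl
use strict;
use warnings;

# Drain the scroll quickly, keeping only what is needed, and return it.
# $scroll is any object with a next($n) method that returns up to $n hits.
sub collect_ids {
    my ($scroll, $batch) = @_;
    my @ids;
    while (my @docs = $scroll->next($batch)) {
        push @ids, map { $_->{_id} } @docs;    # cheap: no slow work in here
    }
    return \@ids;
}

# Hypothetical stand-in for the real scroll object.
package FakeScroll;
sub new  { my ($class, @docs) = @_; return bless { docs => [@docs] }, $class }
sub next { my ($self, $n) = @_; return splice @{ $self->{docs} }, 0, $n }

package main;

my $scroll = FakeScroll->new(map { +{ _id => "doc$_" } } 1 .. 1200);
my $ids    = collect_ids($scroll, 500);

# The slow per-document work happens only after the scroll is exhausted.
for my $id (@$ids) {
    # Do stuff with $id
}
print scalar(@$ids), " ids collected\n";   # prints "1200 ids collected"
```

The trade-off is memory: everything collected has to fit in the array, which is why the sketch keeps only the `_id` of each hit. If that is not possible, a longer keep-alive passed as the `scroll` parameter of `scroll_helper` (for example `scroll => '5m'`, as the asker tried) is the other option mentioned in this thread.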