我使用.From()和.Size()方法从Elastic Search结果中检索所有文档。
以下是示例 -
ISearchResponse<dynamic> bResponse = ObjElasticClient.Search<dynamic>(s => s.From(0).Size(25000).Index("accounts").AllTypes().Query(Query));
最近我遇到了弹性搜索的滚动功能。这看起来比专门用于获取大数据的From()和Size()方法更好。
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
我在NEST API中寻找Scroll功能的例子。
有人可以提供NEST示例吗?
谢谢, Sameer
答案 0 :(得分:11)
这是使用NEST和C#滚动的示例。适用于5.x和6.x
public IEnumerable<T> GetAllDocumentsInIndex<T>(string indexName, string scrollTimeout = "2m", int scrollSize = 1000) where T : class
{
ISearchResponse<T> initialResponse = this.ElasticClient.Search<T>
(scr => scr.Index(indexName)
.From(0)
.Take(scrollSize)
.MatchAll()
.Scroll(scrollTimeout));
List<T> results = new List<T>();
if (!initialResponse.IsValid || string.IsNullOrEmpty(initialResponse.ScrollId))
throw new Exception(initialResponse.ServerError.Error.Reason);
if (initialResponse.Documents.Any())
results.AddRange(initialResponse.Documents);
string scrollid = initialResponse.ScrollId;
bool isScrollSetHasData = true;
while (isScrollSetHasData)
{
ISearchResponse<T> loopingResponse = this.ElasticClient.Scroll<T>(scrollTimeout, scrollid);
if (loopingResponse.IsValid)
{
results.AddRange(loopingResponse.Documents);
scrollid = loopingResponse.ScrollId;
}
isScrollSetHasData = loopingResponse.Documents.Any();
}
this.ElasticClient.ClearScroll(new ClearScrollRequest(scrollid));
return results;
}
来自:http://telegraphrepaircompany.com/elasticsearch-nest-scroll-api-c/
答案 1 :(得分:5)
NEST Reindex
的内部实现使用scroll将文档从一个索引移动到另一个索引。
这应该是一个很好的起点。
以下您可以从github找到有趣的代码。
var page = 0;
var searchResult = this.CurrentClient.Search<T>(
s => s
.Index(fromIndex)
.AllTypes()
.From(0)
.Size(size)
.Query(this._reindexDescriptor._QuerySelector ?? (q=>q.MatchAll()))
.SearchType(SearchType.Scan)
.Scroll(scroll)
);
if (searchResult.Total <= 0)
throw new ReindexException(searchResult.ConnectionStatus, "index " + fromIndex + " has no documents!");
IBulkResponse indexResult = null;
do
{
var result = searchResult;
searchResult = this.CurrentClient.Scroll<T>(s => s
.Scroll(scroll)
.ScrollId(result.ScrollId)
);
if (searchResult.Documents.HasAny())
indexResult = this.IndexSearchResults(searchResult, observer, toIndex, page);
page++;
} while (searchResult.IsValid && indexResult != null && indexResult.IsValid && searchResult.Documents.HasAny());
另外,您可以查看Scroll
[Test]
public void SearchTypeScan()
{
var scanResults = this.Client.Search<ElasticsearchProject>(s => s
.From(0)
.Size(1)
.MatchAll()
.Fields(f => f.Name)
.SearchType(SearchType.Scan)
.Scroll("2s")
);
Assert.True(scanResults.IsValid);
Assert.False(scanResults.FieldSelections.Any());
Assert.IsNotNullOrEmpty(scanResults.ScrollId);
var results = this.Client.Scroll<ElasticsearchProject>(s=>s
.Scroll("4s")
.ScrollId(scanResults.ScrollId)
);
var hitCount = results.Hits.Count();
while (results.FieldSelections.Any())
{
Assert.True(results.IsValid);
Assert.True(results.FieldSelections.Any());
Assert.IsNotNullOrEmpty(results.ScrollId);
var localResults = results;
results = this.Client.Scroll<ElasticsearchProject>(s=>s
.Scroll("4s")
.ScrollId(localResults.ScrollId));
hitCount += results.Hits.Count();
}
Assert.AreEqual(scanResults.Total, hitCount);
}
答案 2 :(得分:0)
我自由地将Michael的好答案改写为async,并且不再那么冗长(v。6.x Nest):
public async Task<IEnumerable<T>> RockAndScroll<T>(
string indexName,
string scrollTimeoutMilliseconds = "2m",
int scrollPageSize = 1000
) where T : class
{
var searchResponse = await this.ElasticClient.SearchAsync<T>(sd => sd
.Index(indexName)
.From(0)
.Take(scrollPageSize)
.MatchAll()
.Scroll(scrollTimeoutMilliseconds));
var results = new List<T>();
while (true)
{
if (!searchResponse.IsValid || string.IsNullOrEmpty(searchResponse.ScrollId))
throw new Exception($"Search error: {searchResponse.ServerError.Error.Reason}");
if (!searchResponse.Documents.Any())
break;
results.AddRange(searchResponse.Documents);
searchResponse = await ElasticClient.ScrollAsync<T>(scrollTimeoutMilliseconds, searchResponse.ScrollId);
}
await this.ElasticClient.ClearScrollAsync(new ClearScrollRequest(searchResponse.ScrollId));
return results;
}