Question

I'm using ArangoDB for a web application through StrongLoop. I have a performance problem when I run this query:

FOR result IN Collection SORT result.field ASC RETURN result

I added an index to speed up the query: a skiplist index on the field being sorted.

My collection contains more than 1M records.

The application is hosted on an n1-highmem-2 instance on Google Cloud. Below are some specs:

Unfortunately, the query takes a long time to finish. What can I do?

Best regards, Carmelo

Answer 1

To summarize the discussion above:

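If there is a skiplist index on the field attribute, it can be used for the sort. However, if the index was created sparse, it cannot. This can be verified by running

db.Collection.getIndexes();

in the ArangoShell. If the index is present and non-sparse, the query should use the index for sorting and no additional sort step is needed, which can be verified using Explain. However, the query will still build a huge result in memory, which takes time and consumes RAM.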
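The two checks above can be sketched as small helpers. This is only an illustration under stated assumptions: `findSortIndex` and `sortsViaIndex` are hypothetical names, the index descriptions are assumed to carry `type`, `fields` and `sparse` attributes as printed by `getIndexes()`, and the execution plan is assumed to list its nodes with type names such as `IndexNode` and `SortNode`. Type names can differ between ArangoDB versions.

```javascript
// Index types assumed to keep entries in sorted order.
// Assumption: newer versions may report skiplist indexes as "persistent".
const SORTED_INDEX_TYPES = ["skiplist", "persistent"];

// Returns the first non-sparse sorted index whose leading field is
// fieldName, or null if there is none.
function findSortIndex(indexes, fieldName) {
  const match = indexes.find(function (idx) {
    return SORTED_INDEX_TYPES.includes(idx.type) &&
      Array.isArray(idx.fields) &&
      idx.fields[0] === fieldName &&
      idx.sparse !== true;
  });
  return match || null;
}

// Returns true if the plan reads from an index and has no separate sort step.
function sortsViaIndex(plan) {
  const types = plan.nodes.map(function (node) { return node.type; });
  return types.includes("IndexNode") && !types.includes("SortNode");
}
```

In the ArangoShell these might be called as `findSortIndex(db.Collection.getIndexes(), "field")` and `sortsViaIndex(db._createStatement({ query: "FOR result IN Collection SORT result.field ASC RETURN result" }).explain().plan)`.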

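If a large result set is needed, LIMIT can be used to retrieve the results in several chunks, which puts less stress on the machine.

For example, the first iteration: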

FOR result IN Collection SORT result.field LIMIT 10000 RETURN result

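Then process these first 10,000 documents offline and note the value of result.field in the last processed document. Now run the query again, this time with an additional FILTER: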

FOR result IN Collection
  FILTER result.field > @lastValue
  SORT result.field
  LIMIT 10000
  RETURN result

Repeat this until there are no more documents. This works correctly if result.field is unique.
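The chunked retrieval described above could be driven by a loop like the following. It is a minimal sketch, not a definitive implementation: `forEachChunk` and `runQuery` are hypothetical names, and `runQuery(aql, bindVars)` is assumed to return an array of documents. In the ArangoShell it could be `function (q, b) { return db._query(q, b).toArray(); }`.

```javascript
// Fetches the collection in sorted chunks and hands each chunk to handleChunk.
// Assumes result.field is unique; duplicates at a chunk boundary would be skipped.
function forEachChunk(runQuery, chunkSize, handleChunk) {
  const firstQuery =
    "FOR result IN Collection SORT result.field LIMIT @size RETURN result";
  const nextQuery =
    "FOR result IN Collection FILTER result.field > @lastValue " +
    "SORT result.field LIMIT @size RETURN result";

  let total = 0;
  let chunk = runQuery(firstQuery, { size: chunkSize });
  while (chunk.length > 0) {
    handleChunk(chunk);
    total += chunk.length;
    // Remember where this chunk ended so the next query can continue from there.
    const lastValue = chunk[chunk.length - 1].field;
    chunk = runQuery(nextQuery, { lastValue: lastValue, size: chunkSize });
  }
  return total; // number of documents processed
}
```

Each query returns at most `chunkSize` documents, so only one chunk has to be held in memory at a time.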

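If result.field is not unique and there is no other unique key in the collection covered by a skiplist index, the described method is at least an approximation.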
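To see why the method is only approximate with duplicate values, the following in-memory sketch simulates the chunking on a plain array. Everything here is hypothetical illustration: `paginate` is not an ArangoDB function, and `key` stands in for some unique attribute. Comparing on the pair (field, key) is one common way to avoid the gap, shown as an assumption rather than as the answer's method.

```javascript
// Simulates chunked retrieval over an in-memory array of { key, field } objects.
// Returns the keys in the order they were retrieved.
function paginate(docs, chunkSize, useTieBreaker) {
  const sorted = docs.slice().sort(function (a, b) {
    return a.field - b.field || a.key - b.key;
  });
  const seen = [];
  let last = null;
  for (;;) {
    const chunk = sorted.filter(function (d) {
      if (last === null) return true;          // first iteration: no filter
      if (d.field > last.field) return true;   // the plain "field > lastValue" filter
      // Optional tie-breaker: same field value, but a larger unique key.
      return useTieBreaker && d.field === last.field && d.key > last.key;
    }).slice(0, chunkSize);
    if (chunk.length === 0) return seen;
    chunk.forEach(function (d) { seen.push(d.key); });
    last = chunk[chunk.length - 1];
  }
}

const sample = [
  { key: 1, field: 10 },
  { key: 2, field: 20 },
  { key: 3, field: 20 }, // same field value as key 2
  { key: 4, field: 30 }
];
console.log(paginate(sample, 2, false)); // [ 1, 2, 4 ]: key 3 is skipped
console.log(paginate(sample, 2, true));  // [ 1, 2, 3, 4 ]
```

With a chunk size of 2, the first chunk ends on the first of the two documents with field 20, so the plain filter never returns the second one.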

Also note that splitting the query into chunks does not provide snapshot isolation, but depending on the use case it may be good enough.