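I have a Cassandra table with more than 2 million rows. I need to fetch the results and paginate them. How can I paginate the results of a SELECT query?
When I try to retrieve 1M rows, I get an RPC timeout.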
Answer 0 (score: 1)
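From the cqlsh command prompt, one way is to restrict the hashed partition key values with the token function. Let's say I have a table that tracks ship crew members (with crewname as my partition key):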
aploetz@cqlsh:presentation> SELECT crewname,token(crewname),firstname,lastname
FROM crew;
crewname | token(crewname) | firstname | lastname
----------+----------------------+-----------+-----------
Simon | -8694467316808994943 | Simon | Tam
Jayne | -3415298744707363779 | Jayne | Cobb
Wash | 596395343680995623 | Hoban | Washburne
Mal | 4016264465811926804 | Malcolm | Reynolds
Zoey | 7853923060445977899 | Zoey | Washburne
Sheppard | 8386579365973272775 | Derial | Book
(6 rows)
If I only want to bring back the crew members from Jayne through Zoey (inclusive, in token order), I can run a query like this:
aploetz@cqlsh:presentation> SELECT crewname,token(crewname),firstname,lastname
FROM crew WHERE token(crewname) >= token('Jayne') AND token(crewname) <= token('Zoey');
crewname | token(crewname) | firstname | lastname
----------+----------------------+-----------+-----------
Jayne | -3415298744707363779 | Jayne | Cobb
Wash | 596395343680995623 | Hoban | Washburne
Mal | 4016264465811926804 | Malcolm | Reynolds
Zoey | 7853923060445977899 | Zoey | Washburne
(4 rows)
You should also be able to do something similar using the partition key values.
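The token-range idea above can be turned into a simple paging scheme. The following is only a sketch under stated assumptions: it assumes the cluster uses Murmur3Partitioner (whose tokens are signed 64-bit values, which matches the tokens in the output above) and that each crewname partition holds a single row. The class and method names (TokenRangePager, splitTokenRing, sliceQuery, nextPageQuery) are hypothetical helpers invented for illustration; they only build CQL strings and do not talk to Cassandra.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRangePager {

    /**
     * Splits the signed 64-bit token range (assumed: Murmur3Partitioner)
     * into `slices` contiguous, inclusive [start, end] ranges.
     */
    public static List<long[]> splitTokenRing(int slices) {
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        // Total number of distinct tokens: 2^64, too large for a long.
        BigInteger width = max.subtract(min).add(BigInteger.ONE);
        BigInteger n = BigInteger.valueOf(slices);

        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < slices; i++) {
            BigInteger start = min.add(width.multiply(BigInteger.valueOf(i)).divide(n));
            BigInteger end = min.add(width.multiply(BigInteger.valueOf(i + 1)).divide(n))
                                .subtract(BigInteger.ONE);
            ranges.add(new long[] {start.longValueExact(), end.longValueExact()});
        }
        return ranges;
    }

    /** Builds a query that reads one token slice of the crew table. */
    public static String sliceQuery(long start, long end) {
        return "SELECT crewname, token(crewname), firstname, lastname FROM crew"
             + " WHERE token(crewname) >= " + start
             + " AND token(crewname) <= " + end;
    }

    /**
     * Builds a query for the page that follows the row whose token is
     * `lastToken`, i.e. the token(crewname) value of the last row already read.
     */
    public static String nextPageQuery(long lastToken, int pageSize) {
        return "SELECT crewname, token(crewname), firstname, lastname FROM crew"
             + " WHERE token(crewname) > " + lastToken
             + " LIMIT " + pageSize;
    }

    public static void main(String[] args) {
        // Four slices that together cover the whole ring.
        for (long[] r : splitTokenRing(4)) {
            System.out.println(sliceQuery(r[0], r[1]));
        }
        // Next page of two rows after Jayne's token from the output above.
        System.out.println(nextPageQuery(-3415298744707363779L, 2));
    }
}
```

Each generated query touches only part of the ring, so it is less likely to hit the RPC timeout than a single scan of the whole table. If a partition can hold many rows, paging on the token alone would skip or truncate rows that share a token, so the clustering columns would also have to be part of the cursor.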
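Otherwise, you can accomplish this with one of the drivers. In her article Things You Should Be Doing When Using Cassandra Drivers, DataStax's Rebecca Mills describes how to use setFetchSize to page through large result sets (the example below follows hers):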
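import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// Assumes an already-connected Session named `session` (DataStax Java driver 2.x/3.x API).
Statement stmt = new SimpleStatement("SELECT * FROM raw_weather_data WHERE wsid = '725474:99999' AND year = 2005 AND month = 6");
stmt.setFetchSize(24); // rows fetched per page
ResultSet rs = session.execute(stmt);
// Iterating the ResultSet fetches the next page transparently once the current one is exhausted,
// so every row is printed (looping on isFullyFetched() alone could stop before all rows are read).
for (Row row : rs) {
    System.out.println(row);
}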