I want to query a Cassandra table with the spark-cassandra-connector using the following statement:
sc.cassandraTable("citizens","records")
.select("identifier","name")
.where( "name='Alice' or name='Bob' ")
I get this error message:
org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 81.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 81.0 (TID 9199, mydomain):
java.io.IOException: Exception during preparation of
SELECT "identifier", "name" FROM "citizens"."records" WHERE token("id") > ? AND token("id") <= ? AND name='Alice' or name='Bob' LIMIT 10 ALLOW FILTERING:
line 1:127 missing EOF at 'or' (...<= ? AND name='Alice' [or] name...)
What am I doing wrong here, and how can I run a where query with an or clause using the connector?
Answer 0 (score: 1)
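Your OR clause is not valid CQL. CQL has no OR operator, which is why the parser stops at 'or' in the error message. For a handful of key values like this (I am assuming name is a key), you can use an IN clause instead.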
.where( "name in ('Alice', 'Bob') ")
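If the list of names is not fixed, the IN clause can be built with positional placeholders rather than by splicing values into the string. The following is only a sketch: `InClause` and `build` are hypothetical names introduced for illustration, and it assumes the connector's `where` accepts `?` placeholders followed by the values to bind.

```scala
// Hypothetical helper (not part of the connector): builds an IN clause
// with one "?" placeholder per value, e.g. "name in (?, ?)".
object InClause {
  def build(column: String, count: Int): String = {
    require(count > 0, "IN needs at least one value")
    val placeholders = Seq.fill(count)("?").mkString(", ")
    s"$column in ($placeholders)"
  }
}

// Assumed usage with the connector (requires a running cluster):
//   val names = Seq("Alice", "Bob")
//   sc.cassandraTable("citizens", "records")
//     .select("identifier", "name")
//     .where(InClause.build("name", names.size), names: _*)
```

Binding the values separately keeps quoting and escaping out of the CQL string. Whether Cassandra accepts IN on name still depends on name being a key column, as assumed above.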
The where clause is used to push CQL down to Cassandra, so only valid CQL can go inside it. If you want SQL-like syntax evaluated on the Spark side, look at Spark SQL and Datasets.
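As a rough illustration of the Spark SQL route, the same table can be loaded as a DataFrame and filtered with an OR expression that Spark parses itself. This is a sketch under assumptions: it uses the `SparkSession` API from Spark 2.x or later, the connection host is a placeholder, and `OrFilterSketch` is a made-up name. Which parts of the predicate, if any, get pushed down to Cassandra depends on the connector version and the table schema.

```scala
import org.apache.spark.sql.SparkSession

object OrFilterSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("or-filter-sketch")
      // Assumption: Cassandra is reachable on localhost; adjust as needed.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()

    // Load the Cassandra table through the connector's data source.
    val records = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "citizens", "table" -> "records"))
      .load()

    // The filter string is Spark SQL, not CQL, so OR is allowed here.
    val result = records
      .select("identifier", "name")
      .where("name = 'Alice' OR name = 'Bob'")

    result.show()
    spark.stop()
  }
}
```

When the filter cannot be pushed down, Spark reads the rows and applies it itself, which can mean scanning far more data than an IN clause on a key column would.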