Question

I am trying to write a query in Spark using Scala. The data is stored in a Cassandra table, and that table has two kinds of keys: 1) the primary key and 2) the partition key.

The Cassandra DDL for the table that my Spark program queries looks like this:

CREATE TABLE A.B (
    id1 text,
    id2 text,
    timing timestamp,
    value float,
    PRIMARY KEY ((id1, id2), timing)
) WITH CLUSTERING ORDER BY (timing DESC)

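When I query on "value" I get results, but when I query on id1 or id2 I get an error.

Error obtained: java.lang.UnsupportedOperationException: Partition key predicate must include all partition key columns or partition key columns need to be indexed. Missing columns: id2

I am using spark-2.2.0-bin-hadoop2.7, Cassandra 3.9, and Scala 2.11.8.

Thanks in advance.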

Answer 1

I got the output I needed with the following program.

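import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext
val conf = new SparkConf(true).set("spark.cassandra.connection.host", "192.168.xx.xxx").set("spark.cassandra.auth.username", "test").set("spark.cassandra.auth.password", "test")
val sc = new SparkContext(conf)
// Restrict both partition key columns (id1 and id2) in the where clauses
val ctable = sc.cassandraTable("A", "B").select("id1", "id2", "timing", "value").where("id1 = ?", "1001").where("id2 = ?", "1002")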

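This is how to filter on the partition key of a Cassandra table from Spark: because the partition key of A.B is the composite (id1, id2), the where clauses have to restrict both id1 and id2, not just one of them.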
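
To make the rule behind the error message concrete, here is a minimal plain-Scala sketch that needs neither Spark nor Cassandra. PartitionKeyCheck and missingColumns are hypothetical names made up for illustration; they are not part of the Spark Cassandra Connector, and the sketch only mimics the idea of the check, assuming the partition key (id1, id2) from the DDL in the question.

```scala
// Hypothetical helper for illustration only; not part of the Spark Cassandra Connector API.
object PartitionKeyCheck {
  // Partition key columns of table A.B, taken from PRIMARY KEY ((id1, id2), timing).
  val partitionKey: Seq[String] = Seq("id1", "id2")

  // Returns the partition key columns that are NOT restricted by the predicate.
  // An empty result means the predicate covers the whole partition key.
  def missingColumns(restricted: Set[String]): Seq[String] =
    partitionKey.filterNot(restricted.contains)

  def main(args: Array[String]): Unit = {
    // Filtering on id1 alone leaves id2 unrestricted, as in the reported error.
    println(missingColumns(Set("id1")))        // List(id2)
    // Filtering on both id1 and id2 covers the full partition key.
    println(missingColumns(Set("id1", "id2"))) // List()
  }
}
```

With only id1 restricted, the sketch reports id2 as missing, which matches the "Missing columns: id2" part of the exception; restricting both columns leaves nothing missing, which is what the program above does.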
