Question

I defined a class to map the rows of a Cassandra table:

case class Log(
    val time: Long,
    val date: String,
    val appId: String,
    val instanceId: String,
    val appName: String,
    val channel: String,
    val originCode: String,
    val message: String) {
}

I created an RDD to hold all my tuples:

 val logEntries = sc.cassandraTable[Log]("keyspace", "log")

To check that everything works, I print:

println(logEntries.count()) // works: prints the number of tuples retrieved
println(logEntries.first()) // exception on this line

java.lang.AssertionError: assertion failed: Missing columns needed by com.model.Log: app_name, app_id, origin_code, instance_id

The columns of my log table in Cassandra are:

time bigint, date text, appid text, instanceid text, appname text, channel text, origincode text, message text

What is wrong?

Answer 1

As mentioned in the spark-cassandra-connector docs, the column name mapper has its own logic for converting case class parameters to column names:

For multi-word column identifiers, separate each word by an underscore in Cassandra, and use the camel case convention on the Scala side.

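So if you use case class Log(appId: String, instanceId: String) with camel-cased parameters, they are automatically mapped to underscore-separated column names: app_id text, instance_id text. They cannot be automatically mapped to appid text, instanceid text, because those column names have no underscores. That is why the exception lists app_name, app_id, origin_code and instance_id as missing, while count() succeeds: counting does not need to build Log objects.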
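To make the mismatch concrete, here is a minimal, dependency-free sketch of the naming convention quoted above. The helper camelToUnderscore and the object NamingSketch are hypothetical names made up for illustration. They only approximate the documented convention and are not the connector's own code. The column and parameter names are taken from the question.

```scala
object NamingSketch {
  // Hypothetical helper: approximates the documented camelCase -> underscore
  // convention. This is an illustration, not the connector's implementation.
  def camelToUnderscore(name: String): String =
    name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase

  // Column names from the table definition in the question.
  val tableColumns: Set[String] = Set(
    "time", "date", "appid", "instanceid",
    "appname", "channel", "origincode", "message")

  // Parameter names of the original Log case class.
  val logParams: Seq[String] = Seq(
    "time", "date", "appId", "instanceId",
    "appName", "channel", "originCode", "message")

  // Expected column names that do not exist in the table.
  def missingColumns(params: Seq[String]): Seq[String] =
    params.map(camelToUnderscore).filterNot(tableColumns.contains)

  // One possible fix: name the parameters exactly like the existing columns.
  case class Log(
      time: Long, date: String, appid: String, instanceid: String,
      appname: String, channel: String, origincode: String, message: String)

  def main(args: Array[String]): Unit = {
    // Yields the same four names the exception reports (in parameter order).
    println(missingColumns(logParams))
    // With all-lowercase parameter names nothing is missing.
    println(missingColumns(logParams.map(_.toLowerCase)))
  }
}
```

With the parameters renamed as in the Log class inside the sketch, sc.cassandraTable[Log]("keyspace", "log") should find every column it needs. The other option is to keep the camel-cased parameters and rename the Cassandra columns to app_id, instance_id, app_name and origin_code.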
