I defined a class to map the rows of a Cassandra table:
case class Log(
  time: Long,
  date: String,
  appId: String,
  instanceId: String,
  appName: String,
  channel: String,
  originCode: String,
  message: String)
I created an RDD to hold all of my tuples:
val logEntries = sc.cassandraTable[Log]("keyspace", "log")
To check that everything works, I print the results:
println(logEntries.count()) // works, prints the number of tuples retrieved
println(logEntries.first()) // throws an exception on this line
java.lang.AssertionError: assertion failed: Missing columns needed by com.model.Log: app_name, app_id, origin_code, instance_id
The columns of my log table in Cassandra are:
time bigint, date text, appid text, instanceid text, appname text, channel text, origincode text, message text
What is wrong?
Answer 0 (score: 2)
As mentioned in the spark-cassandra-connector docs, the column name mapper has its own logic for converting case class parameters to column names:
For multi-word column identifiers, separate each word by an underscore in Cassandra, and use the camel case convention on the Scala side.
So if you use case class Log(appId: String, instanceId: String) with camel-cased parameters, they are automatically mapped to underscore-separated column names: app_id text, instance_id text. They cannot be automatically mapped to appid text, instanceid text: those column names are missing the underscore.
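To make the mismatch concrete, here is a minimal sketch in plain Scala (no Spark needed) that imitates the camel-case to underscore rule and compares the result with the columns from the question. The helper camelToUnderscore and the object name are hypothetical and written only for illustration; this is not the connector's actual implementation. The second case class, LogLowercase, shows one possible fix under the assumption that the mapper also accepts a field whose name equals the column name exactly.

```scala
object ColumnNameSketch {
  // Hypothetical helper that imitates the documented convention:
  // camelCase on the Scala side, underscore_separated in Cassandra.
  def camelToUnderscore(name: String): String =
    name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase

  // Field names of the Log case class from the question.
  val fields: Seq[String] = Seq(
    "time", "date", "appId", "instanceId",
    "appName", "channel", "originCode", "message")

  // Column names of the Cassandra table from the question.
  val tableColumns: Set[String] = Set(
    "time", "date", "appid", "instanceid",
    "appname", "channel", "origincode", "message")

  // Column names the mapper would look for but the table does not have.
  def missingColumns: Seq[String] =
    fields.map(camelToUnderscore).filterNot(tableColumns.contains)

  // One possible fix (assumption: a field name identical to the column
  // name is matched as-is): use all-lowercase field names.
  case class LogLowercase(
    time: Long,
    date: String,
    appid: String,
    instanceid: String,
    appname: String,
    channel: String,
    origincode: String,
    message: String)

  def main(args: Array[String]): Unit = {
    // prints: app_id, instance_id, app_name, origin_code
    println(missingColumns.mkString(", "))
  }
}
```

The four names printed are the same ones listed in the AssertionError. The other way out is to keep the camel-cased case class and rename the table columns to app_id, instance_id, app_name and origin_code.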