I run the following command:
spark-shell --packages datastax:spark-cassandra-connector:1.6.0-s_2.10
Then I stop the context:
sc.stop
Then I run this code in the REPL:
val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new org.apache.spark.SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val cc = new org.apache.spark.sql.cassandra.CassandraSQLContext(sc)
cc.setKeyspace("ksp")
cc.sql("SELECT * FROM continents").registerTempTable("conts")
val allContinents = sqlContext.sql("SELECT * FROM conts").collect
I get:
org.apache.spark.sql.AnalysisException: Table not found: conts;
The keyspace ksp and the table continents are both defined in Cassandra, so I suspect the error is not coming from that side.
(Tested on Spark 1.6.0 and 1.6.1.)
Answer 0 (score: 1)
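Because you are using different contexts to create the DataFrame and to execute the SQL. registerTempTable registers conts only in the context that created the DataFrame (cc here), so sqlContext cannot resolve it. Run the query through the same context: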
val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new org.apache.spark.SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val cc = new org.apache.spark.sql.cassandra.CassandraSQLContext(sc)
cc.setKeyspace("ksp")
cc.sql("SELECT * FROM continents").registerTempTable("conts")
// use cc instead of sqlContext
val allContinents = cc.sql("SELECT * FROM conts").collect
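
To see the scoping without Cassandra involved, here is a minimal sketch, assuming Spark 1.6.x in local mode, where each SQLContext keeps its own temp-table catalog. The names demoConf, demoSc, ctxA and ctxB are made up for illustration and stand in for cc and sqlContext from the question. On Spark 2.x and later, contexts created this way may share a session, so the result may differ there.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Assumption: Spark 1.6.x, local mode. In spark-shell, run sc.stop first
// (as in the question), since only one SparkContext may be active per JVM by default.
val demoConf = new SparkConf().setAppName("temp-table-scope").setMaster("local[2]")
val demoSc = new SparkContext(demoConf)

// Two independent SQL contexts on the same SparkContext (hypothetical names).
val ctxA = new SQLContext(demoSc)
val ctxB = new SQLContext(demoSc)

// Build a tiny DataFrame through ctxA and register it as a temp table there.
val conts = ctxA.createDataFrame(Seq((1, "Africa"), (2, "Asia"))).toDF("id", "name")
conts.registerTempTable("conts")

// Check which context lists the temp table in its catalog.
val visibleInA = ctxA.tableNames().contains("conts")
val visibleInB = ctxB.tableNames().contains("conts")

println(s"ctxA sees conts: $visibleInA, ctxB sees conts: $visibleInB")

demoSc.stop()
```

If ctxB does not list conts, a query against it fails with the same "Table not found" error as in the question, which is why the fix is to query through the context that registered the table.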