I am trying to read data from a Cassandra keyspace, but the keyspace is not visible when I try to access it with PySpark.
It works when using the DataStax spark shell:
ubuntu@ip-172-31-60-229:~$ sudo dse -u xxxx -p xxxxx spark --conf spark.driver.cores=4
The log file is at /home/ubuntu/.spark-shell.log
warning: there was one deprecation warning; re-run with -deprecation for details
New Spark Session
WARN 2017-08-28 07:02:21,286 org.apache.spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
Extracting Spark Context
Extracting SqlContext
Spark context Web UI available at http://xxx.xxx.xx.xx:4040
Spark context available as 'sc' (master = dse://?, app id = app-2017082807xxx1-0002).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.0.2.6
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql("show databases").show()
+----------------+
| databaseName |
+----------------+
| default |
| dsefs |
| OpsCenter |
|staging_keyspace|
+----------------+
But when using the following code, only the default database is visible:
from pyspark.sql import SparkSession
from pyspark.sql import SQLContext
import os

# DSE credentials (masked)
os.environ["DSE_USERNAME"] = 'xxx'
os.environ["DSE_PASSWORD"] = 'xxxx'

# Build a local session pointed at the Cassandra node on localhost:9042
spark = SparkSession.builder.master("local").appName("SparkCassandraApp") \
    .config("spark.cassandra.connection.host", "localhost") \
    .config("spark.cassandra.connection.port", "9042") \
    .getOrCreate()

sql_context = SQLContext(spark.sparkContext)
sql_context.sql('show databases').show()
spark.stop()
+------------+
|databaseName|
+------------+
|     default|
+------------+
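For reference, a direct read through the spark-cassandra-connector DataFrame reader would look roughly like the sketch below. This is a minimal sketch, not something I have confirmed against this cluster: my_table is a hypothetical table name, and passing the credentials through the connector's spark.cassandra.auth.* options (instead of the DSE_USERNAME/DSE_PASSWORD environment variables) is an assumption; host, port, and credentials are placeholders.

from pyspark.sql import SparkSession

# Minimal sketch: "my_table" is a hypothetical table name, and the host,
# port, and credentials below are placeholders.
spark = SparkSession.builder \
    .master("local") \
    .appName("SparkCassandraApp") \
    .config("spark.cassandra.connection.host", "localhost") \
    .config("spark.cassandra.connection.port", "9042") \
    .config("spark.cassandra.auth.username", "xxx") \
    .config("spark.cassandra.auth.password", "xxxx") \
    .getOrCreate()

# Read the Cassandra table directly via the connector's data source.
# This path does not rely on the catalog that 'show databases' lists,
# so the keyspace does not need to appear there.
df = spark.read \
    .format("org.apache.spark.sql.cassandra") \
    .options(keyspace="staging_keyspace", table="my_table") \
    .load()

df.show()
spark.stop()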