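I have installed Spark 2.1 from Cloudera. When I launch the Spark shell from /usr/bin/spark2-shell, it runs fine (using Scala). When I launch PySpark with the following command, I run into a problem: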
sudo -u hdfs ./pyspark2
I get:
java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool. Original Exception: ------
java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details.
......
Caused by: ERROR XBM0H: Directory /usr/bin/metastore_db cannot be created.
Caused by: java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details
.....
Caused by: ERROR XJ041: Failed to create database 'metastore_db', see the next exception for details.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
    ... 105 more
Caused by: ERROR XBM0H: Directory /usr/bin/metastore_db cannot be created.
Traceback (most recent call last):
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/pyspark/shell.py", line 43, in <module>
    spark = SparkSession.builder\
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/pyspark/sql/session.py", line 179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/pyspark/sql/utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
I think the problem occurs when the HiveContext is created from pyspark. Also, how can I run pyspark without creating a HiveContext? Any help would be appreciated.
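One detail in the trace that may matter: the JDBC URL uses a relative database name (metastore_db), and the Derby error names /usr/bin/metastore_db, which suggests the embedded metastore is being created in the directory the shell was launched from. A minimal sketch for checking whether the launching user can write to a given directory is below. It uses only the Python standard library; the helper names `can_create_metastore` and `pick_launch_dir` are hypothetical and not part of Spark or PySpark, and the candidate directories are assumptions chosen for illustration.

```python
import os
import tempfile


def can_create_metastore(directory):
    """Return True if the current user could create a subdirectory
    (such as Derby's metastore_db) inside `directory`."""
    # Creating an entry in a directory needs write and search (execute) permission.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


def pick_launch_dir(candidates):
    """Return the first candidate directory that is writable, or None."""
    for path in candidates:
        if can_create_metastore(path):
            return path
    return None


if __name__ == "__main__":
    cwd = os.getcwd()
    print("current directory:", cwd)
    print("writable by this user:", can_create_metastore(cwd))
    # Assumed fallbacks: the user's home directory and the system temp directory.
    fallbacks = [cwd, os.path.expanduser("~"), tempfile.gettempdir()]
    print("suggested launch directory:", pick_launch_dir(fallbacks))
```

Run it as the same user that launches the shell (here, hdfs) from the same directory. If it reports the current directory as not writable, changing to a writable directory before starting pyspark2 is one thing to try; this is a diagnostic under the assumption above, not a confirmed fix.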