I'm using Python 2.7 with a Spark standalone cluster in client mode.

I want to use JDBC for MySQL and found out that I need to load it with the --jars argument. I have the JDBC driver locally and managed to load it from the pyspark console as described here.

When I write a Python script in my IDE, using pyspark, I cannot load the extra jar mysql-connector-java-5.1.26.jar and I keep getting a

No suitable driver

error.

How can I load additional jar files when running a Python script in client mode, using a standalone cluster in client mode and referencing a remote master?

EDIT: added some code
#################################################################################
This is the basic code I'm using. I use pyspark with a SparkContext inside Python, i.e. I don't call spark-submit directly and don't understand how to use spark-submit parameters in this case...
def createSparkContext(masterAdress = algoMaster):
    """
    :return: return a spark context that is suitable for my configs
    note the ip for the master
    app name is not that important, just to show off
    """
    from pyspark.mllib.util import MLUtils
    from pyspark import SparkConf
    from pyspark import SparkContext
    import os

    SUBMIT_ARGS = "--driver-class-path /var/nfs/general/mysql-connector-java-5.1.43 pyspark-shell"
    #SUBMIT_ARGS = "--packages com.databricks:spark-csv_2.11:1.2.0 pyspark-shell"
    os.environ["PYSPARK_SUBMIT_ARGS"] = SUBMIT_ARGS
    conf = SparkConf()
    #conf.set("spark.driver.extraClassPath", "var/nfs/general/mysql-connector-java-5.1.43")
    conf.setMaster(masterAdress)
    conf.setAppName('spark-basic')
    conf.set("spark.executor.memory", "2G")
    #conf.set("spark.executor.cores", "4")
    conf.set("spark.driver.memory", "3G")
    conf.set("spark.driver.cores", "3")
    #conf.set("spark.driver.extraClassPath", "/var/nfs/general/mysql-connector-java-5.1.43")
    sc = SparkContext(conf=conf)
    print sc._conf.get("spark.executor.extraClassPath")
    return sc
sql = SQLContext(sc)
df = sql.read.format('jdbc').options(url='jdbc:mysql://ip:port?user=user&password=pass', dbtable='(select * from tablename limit 100) as tablename').load()
print df.head()
Thanks
Answer 0 (score: 2)
When you create a SparkContext from Python, your SUBMIT_ARGS are passed to spark-submit. You should use --jars instead of --driver-class-path.
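For illustration, a minimal sketch of what the corrected submit arguments could look like (the jar path and file name here are assumptions modeled on the question's code; point them at your actual connector .jar):

    import os

    # Hypothetical jar location, modeled on the question's path; --jars ships the
    # connector to both the driver and the executors.
    SUBMIT_ARGS = "--jars /var/nfs/general/mysql-connector-java-5.1.43-bin.jar pyspark-shell"
    # Must be set before the SparkContext is constructed.
    os.environ["PYSPARK_SUBMIT_ARGS"] = SUBMIT_ARGS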
EDIT

Your problem is actually much simpler than it seems: you are missing the driver parameter in the options:
sql = SQLContext(sc)
df = sql.read.format('jdbc').options(
    url='jdbc:mysql://ip:port',
    user='user',
    password='pass',
    driver="com.mysql.jdbc.Driver",
    dbtable='(select * from tablename limit 100) as tablename'
).load()
You can also pass user and password as separate parameters, as shown above, rather than embedding them in the URL.
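For comparison, a sketch of the same read with the credentials embedded in the JDBC URL instead (host, port, and table name are the question's placeholders; the driver option is still required either way):

    # Same query, credentials inside the URL rather than as separate options.
    df = sql.read.format('jdbc').options(
        url='jdbc:mysql://ip:port?user=user&password=pass',
        driver="com.mysql.jdbc.Driver",
        dbtable='(select * from tablename limit 100) as tablename'
    ).load()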