Cannot connect Oracle with Apache Spark using an SSO wallet

Date: 2018-12-04 15:39:18

Tags: scala apache-spark oracle-sso oracle-wallet

We are trying to connect to a remote Oracle database running on Amazon RDS from Apache Spark, using the SSO wallet configured as shown at the end. We are able to load data with the spark-shell utility, as described below.

Launch spark-shell with the JDBC and oraclepki jars added to the classpath:

 spark-shell --driver-class-path /path/to/ojdbc8.jar:/path/to/oraclepki.jar

This is the JDBC URL used:

 val JDBCURL="jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=www.example.aws.server.com)(PORT=1527))(CONNECT_DATA=(SID=XXX))(SECURITY = (SSL_SERVER_CERT_DN =\"C=US,ST=xxx,L=ZZZ,O=Amazon.com,OU=RDS,CN=www.xxx.aws.zzz.com\")))"

Below is the Spark JDBC call used to load the data:

 spark.read.format("jdbc").option("url",JDBCURL)
.option("user","USER")
.option("oracle.net.tns_admin","/path/to/tnsnames.ora")
.option("oracle.net.wallet_location","(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/path/to/ssl_wallet/)))")
.option("password", "password")
.option("javax.net.ssl.trustStore","/path/to/cwallet.sso")
.option("javax.net.ssl.trustStoreType","SSO")
.option("dbtable",QUERY)
.option("driver", "oracle.jdbc.driver.OracleDriver").load    

But when we try to run this with the spark-submit command, we get the following error:

    Exception in thread "main" java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
    at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:774)
    at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688)
    ...
    ...
    ...

    Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
    at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:523)
    at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:521)
    at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:660)
    at oracle.net.ns.NSProtocol.connect(NSProtocol.java:286)
    at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1438)
    at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:518)
    ... 28 more
    Caused by: oracle.net.ns.NetException: Unable to initialize ssl context.
    at oracle.net.nt.CustomSSLSocketFactory.getSSLSocketEngine(CustomSSLSocketFactory.java:597)
    at oracle.net.nt.TcpsNTAdapter.connect(TcpsNTAdapter.java:143)
    at oracle.net.nt.ConnOption.connect(ConnOption.java:161)
    at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:470)
    ... 33 more
    Caused by: oracle.net.ns.NetException: Unable to initialize the key store.
    at oracle.net.nt.CustomSSLSocketFactory.getKeyManagerArray(CustomSSLSocketFactory.java:642)
    at oracle.net.nt.CustomSSLSocketFactory.getSSLSocketEngine(CustomSSLSocketFactory.java:580)
    ... 36 more
    Caused by: java.security.KeyStoreException: SSO not found
    at java.security.KeyStore.getInstance(KeyStore.java:851)
    at oracle.net.nt.CustomSSLSocketFactory.getKeyManagerArray(CustomSSLSocketFactory.java:628)
    ... 37 more
    Caused by: java.security.NoSuchAlgorithmException: SSO KeyStore not available
    at sun.security.jca.GetInstance.getInstance(GetInstance.java:159)
    at java.security.Security.getImpl(Security.java:695)
    at java.security.KeyStore.getInstance(KeyStore.java:848)

I am new to this and may be doing something wrong here. This is how I tried to configure the SparkConf:

    val conf = new SparkConf().setAppName(JOB_NAME)
    conf.set("javax.net.ssl.trustStore", "/path/to/cwallet.sso");
    conf.set("javax.net.ssl.trustStoreType", "SSO")
    conf.set("oracle.net.tns_admin", "/path/to/tnsnames.ora")
    conf.set("oracle.net.wallet_location", "(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/path/to/ssl_wallet/dir/)))")
    conf.set("user", "user")
    conf.set("password", "pass")

Below is the spark-submit command used:

    spark-submit --class fully.qualified.path.to.main \
    --jars /path/to/ojdbc8.jar,/path/to/oraclepki.jar,/path/to/osdt_cert.jar,/path/to/osdt_core.jar \
    --deploy-mode client --files /path/to/hive-site.xml --master yarn  \
    --driver-memory 12G \
    --conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=/path/to/cwallet.sso -Djavax.net.ssl.trustStoreType=SSO" \
    --executor-cores 4 --executor-memory 12G \
    --num-executors 20 /path/to/application.jar /path/to/application_custom_config.conf

Also tried adding the following to the spark-submit invocation:

    --conf 'spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=/path/to/cwallet.sso -Djavax.net.ssl.trustStoreType=SSO'

    --files /path/to/cwallet.sso,/path/to/tnsnames.ora

Still no luck. What exactly am I doing wrong here? I also tried the solution mentioned in this post, but got the same error. Do I need to make sure the trustStore is accessible on every executor node? If so, why does the spark-shell run work fine? Does that mean spark-shell does not involve any worker nodes when executing this?

Please advise.

Update:

  

It looks like you are using the 12.1.0.2 JDBC driver. Please upgrade to 18.3, which you can download from oracle.com/technetwork/database/application-development/jdbc/… Changes have been made there that simplify using the wallet. -- @Jean de Lavarene

After making the changes @Jean de Lavarene suggested, the original error went away, but this is what I am getting now:

    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, example.server.net, executor 2): java.sql.SQLException: PKI classes not found. To use 'connect /' functionality, oraclepki.jar must be in the classpath: java.lang.NoClassDefFoundError: oracle/security/pki/OracleWallet
    at oracle.jdbc.driver.PhysicalConnection.getSecretStoreCredentials(PhysicalConnection.java:3058)
    at oracle.jdbc.driver.PhysicalConnection.parseUrl(PhysicalConnection.java:2823) 

When I run in Spark local mode (--master local[*]) it works fine, but it fails in yarn mode.

I am already using the --jars option with a comma-separated list of jars. What I found is the following (a sketch combining the flags appears after this list):

1) --jars expects local paths, which Spark then copies to an HDFS path
2) Using file:/// did not work
3) If the --jars argument is left out entirely, the program complains that the JDBC driver class is missing. As soon as I add ojdbc8.jar via --jars, that error goes away and the "oraclepki.jar not found" error appears instead. I do not know why that is.
4) Also tried using : as the separator when listing multiple jars, without any luck
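
For reference, a sketch of how the two flags can be combined in this situation (paths are placeholders): --jars ships the listed jars to the cluster and adds them to the executors' classpath, while --driver-class-path only changes the classpath the driver JVM is launched with, which is apparently what the Oracle PKI classes need here (see Update 2 below):

    # Sketch only: distribute the jars to the executors with --jars and also
    # put them on the driver's launch classpath with --driver-class-path.
    spark-submit --class fully.qualified.path.to.main \
      --master yarn --deploy-mode client \
      --jars /path/to/ojdbc8.jar,/path/to/oraclepki.jar,/path/to/osdt_cert.jar,/path/to/osdt_core.jar \
      --driver-class-path /path/to/ojdbc8.jar:/path/to/oraclepki.jar:/path/to/osdt_cert.jar:/path/to/osdt_core.jar \
      /path/to/application.jar /path/to/application_custom_config.conf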

Update 2:

I was able to resolve the oraclepki.jar not found exception by using:

    --driver-class-path /path/to/oraclepki.jar:/path/to/osdt_cert.jar:/path/to/others.jar 

But as soon as we switch to --master yarn mode, the following exception shows up:

    Caused by: oracle.net.ns.NetException: Unable to initialize the key store.
    at oracle.net.nt.CustomSSLSocketFactory.getKeyManagerArray(CustomSSLSocketFactory.java:617)
    at oracle.net.nt.CustomSSLSocketFactory.createSSLContext(CustomSSLSocketFactory.java:322)
    ... 32 more
    Caused by: java.io.FileNotFoundException: /path/to/cwallet.sso (No such file or directory)

As far as I understand, when tasks run on the worker nodes, the cwallet.sso file path is not available on those nodes. We tried specifying an HDFS path for the wallet, but the utility expects a local path when the wallet is created.

So do we need to manually copy the wallet file to every worker node? Or is there a better way to achieve this?

Please advise.

1 answer:

Answer 0 (score: 1):

Basically, this is how we were able to solve it. The one thing to keep in mind here is that the SSO file must exist on every node where Spark runs (the Spark executor nodes):

    val SOURCE_DF = spark.read.format("jdbc")
        .option("url", "jdbc:oracle:thin:@...full string here")
        .option("oracle.net.wallet_location", "(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/path/to/sso/dir)))")
        ...
        ...

If you need to pass additional details, you can add more .option parameters:

   .option("oracle.net.tns_admin", "oracle/tns/file/path"))
   .option("javax.net.ssl.trustStoreType", "sso")