SAP Vora 1.2 - Reading Vora Tables from HANA

Posted: 2016-05-12 07:33:42

Tags: hana vora

!!!UPDATE!!!

Finally, after hours of digging through the documentation, I found the problem. It turned out I was missing some parameters in my Yarn configuration.

Here is what I did:

  1. Open the yarn-site.xml file in an editor, or log in to the Ambari Web UI and select Yarn > Config. Find the property "yarn.nodemanager.aux-services" and add "spark_shuffle" to its current value; the new value should be "mapreduce_shuffle,spark_shuffle" (see the snippet after this list).
  2. Add or edit the property "yarn.nodemanager.aux-services.spark_shuffle.class" and set it to "org.apache.spark.network.yarn.YarnShuffleService".
  3. Copy the spark-yarn-shuffle.jar file (downloaded in the "Install Spark Assembly file and dependent libraries" step) from Spark into the Hadoop-Yarn classpath on all Node Manager hosts. This folder is usually /usr/hdp/<version>/hadoop-yarn/lib.
  4. Restart Yarn and the Node Managers.
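
For reference, the resulting yarn-site.xml entries should look roughly like this (a sketch only; keep whatever aux-services your distribution already lists and append spark_shuffle):

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
        <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>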
!!!!!!!!!!!

I am using the SAP Vora 1.2 Developer Edition with the latest Spark Controller (HANASPARKCTRL00P_5-70001262.RPM). I loaded a table into Vora in the spark-shell. I can see the table in SAP HANA Studio in the "spark_velocity" folder, and I can load it as a virtual table. The problem is that I cannot select or preview the data in the table because of this error:

      

    Error: SAP DBTech JDBC: [403]: internal error: Error opening the cursor for the remote database for query "SELECT "SPARK_testtable"."a1", "SPARK_testtable"."a2", "SPARK_testtable"."a3" FROM "spark_velocity"."testtable" "SPARK_testtable" LIMIT 200"
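
For context, the table was registered from the spark-shell using the Vora data source. The following is only a rough sketch of such a load: the column types are an assumption, the CSV path is taken from the controller log further down, and the class and option names follow the Vora 1.2 data source API:

    # launch the shell with the Vora datasources assembly on the classpath
    # (jar path as seen in the controller log below)
    spark-shell --jars /usr/sap/spark/controller/lib/spark-sap-datasources-1.2.33-assembly.jar

    # then, inside the shell:
    #   val vc = new org.apache.spark.sql.SapSQLContext(sc)
    #   vc.sql("""CREATE TABLE testtable (a1 string, a2 string, a3 string)
    #             USING com.sap.spark.vora
    #             OPTIONS (tableName "testtable", paths "/user/vora/test.csv")""")
    #   vc.sql("SELECT * FROM testtable").show()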

Here is my hanaes-site.xml file:

    <configuration>
        <!--  You can either copy the assembly jar into HDFS or to lib/external directory.
        Please maintain appropriate value here-->
        <property>
            <name>sap.hana.es.spark.yarn.jar</name>
            <value>file:///usr/sap/spark/controller/lib/external/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar</value>
            <final>true</final>
        </property>
        <property>
            <name>sap.hana.es.server.port</name>
            <value>7860</value>
            <final>true</final>
        </property>
        <!--  Required if you are copying your files into HDFS-->
         <property>
             <name>sap.hana.es.lib.location</name>
             <value>hdfs:///sap/hana/spark/libs/thirdparty/</value>
             <final>true</final>
         </property>
        <!--Required property if using controller for DLM scenarios-->
        <!--
        <property>
            <name>sap.hana.es.warehouse.dir</name>
            <value>/sap/hana/hanaes/warehouse</value>
            <final>true</final>
        </property>
    -->
        <property>
            <name>sap.hana.es.driver.host</name>
            <value>ip-10-0-0-[censored].ec2.internal</value>
            <final>true</final>
        </property>
        <!-- Change this value to vora when connecting to Vora store -->
        <property>
            <name>sap.hana.hadoop.datastore</name>
            <value>vora</value>
            <final>true</final>
        </property>
    
        <!-- // When running against a kerberos protected cluster, please maintain appropriate values
        <property>
            <name>spark.yarn.keytab</name>
            <value>/usr/sap/spark/controller/conf/hanaes.keytab</value>
            <final>true</final>
        </property>
        <property>
            <name>spark.yarn.principal</name>
            <value>hanaes@PAL.SAP.CORP</value>
            <final>true</final>
        </property>
    -->
    <!-- To enable Secure Socket communication, please maintain appropriate values in the following section-->
        <property>
            <name>sap.hana.es.ssl.keystore</name>
            <value></value>
            <final>false</final>
        </property>
        <property>
            <name>sap.hana.es.ssl.clientauth.required</name>
            <value>true</value>
            <final>true</final>
        </property>
        <property>
            <name>sap.hana.es.ssl.verify.hostname</name>
            <value>true</value>
            <final>true</final>
        </property>
        <property>
            <name>sap.hana.es.ssl.keystore.password</name>
            <value></value>
            <final>true</final>
        </property>
        <property>
            <name>sap.hana.es.ssl.truststore</name>
            <value></value>
            <final>true</final>
        </property>
        <property>
            <name>sap.hana.es.ssl.truststore.password</name>
            <value></value>
            <final>true</final>
        </property>
        <property>
            <name>sap.hana.es.ssl.enabled</name>
            <value>false</value>
            <final>true</final>
        </property>
    
        <property>
            <name>spark.executor.instances</name>
            <value>10</value>
            <final>true</final>
        </property>
        <property>
            <name>spark.executor.memory</name>
            <value>5g</value>
            <final>true</final>
        </property>
        <!-- Enable the following section if you want to enable dynamic allocation-->
        <!--
        <property>
            <name>spark.dynamicAllocation.enabled</name>
            <value>true</value>
            <final>true</final>
        </property>
    
        <property>
            <name>spark.dynamicAllocation.minExecutors</name>
            <value>10</value>
            <final>true</final>
        </property>
        <property>
            <name>spark.dynamicAllocation.maxExecutors</name>
            <value>20</value>
            <final>true</final>
        </property>
        <property>
        <name>spark.shuffle.service.enabled</name>
        <value>true</value>
        <final>true</final>
       </property>
    <property>
             <name>sap.hana.ar.provider</name>
             <value>com.sap.hana.aws.extensions.AWSResolver</value>
             <final>true</final>
         </property>
    <property>
            <name>spark.vora.hosts</name>
            <value>ip-10-0-0-[censored].ec2.internal:2022,ip-10-0-0-[censored].ec2.internal:2022,ip-10-0-0-[censored].ec2.internal:2022</value>
            <final>true</final>
         </property>
         <property>
            <name>spark.vora.zkurls</name>
            <value>ip-10-0-0-[censored].ec2.internal:2181,ip-10-0-0-[censored].ec2.internal:2181,ip-10-0-0-[censored].ec2.internal:2181</value>
            <final>true</final>
         </property>
    </configuration>
    

    ls /usr/sap/spark/controller/lib/external/

    spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar
    

    hdfs dfs -ls /sap/hana/spark/libs/thirdparty

    Found 4 items
    -rwxrwxrwx   3 hdfs hdfs     366565 2016-05-11 13:09 /sap/hana/spark/libs/thirdparty/datanucleus-api-jdo-4.2.1.jar
    -rwxrwxrwx   3 hdfs hdfs    2006182 2016-05-11 13:09 /sap/hana/spark/libs/thirdparty/datanucleus-core-4.1.2.jar
    -rwxrwxrwx   3 hdfs hdfs    1863315 2016-05-11 13:09 /sap/hana/spark/libs/thirdparty/datanucleus-rdbms-4.1.2.jar
    -rwxrwxrwx   3 hdfs hdfs     627814 2016-05-11 13:09 /sap/hana/spark/libs/thirdparty/joda-time-2.9.3.jar
    

    ls /usr/hdp/

    2.3.4.0-3485  2.3.4.7-4  current
    

    vi /var/log/hanaes/hana_controller.log

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/sap/spark/controller/lib/spark-sap-datasources-1.2.33-assembly.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/sap/spark/controller/lib/external/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/hdp/2.3.4.0-3485/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    16/05/12 07:02:38 INFO HanaESConfig: Loaded HANA Extended Store Configuration
    Found Spark Libraries. Proceeding with Current Class Path
    16/05/12 07:02:39 INFO Server: Starting Spark Controller
    16/05/12 07:03:11 INFO CommandRouter: Connecting to Vora Engine
    16/05/12 07:03:11 INFO CommandRouter: Initialized Router
    16/05/12 07:03:11 INFO CommandRouter: Server started
    16/05/12 07:03:43 INFO CommandHandler: Getting BROWSE data/user/17401406272892502037-4985062628452729323_f17e36cf-0003-0015-452e-800c700001ee
    16/05/12 07:03:48 INFO CommandHandler: Getting BROWSE data/user/17401406272892502037-4985062628452729329_f17e36cf-0003-0015-452e-800c700001f4
    16/05/12 07:03:48 INFO VoraClientFactory: returning a Vora catalog client of this Vora catalog server: master.i-14371789.cluster:2204
    16/05/12 07:03:48 INFO CBinder: searching for compat-sap-c++.so at /opt/rh/SAP/lib64/compat-sap-c++.so
    16/05/12 07:03:48 WARN CBinder: could not find compat-sap-c++.so
    16/05/12 07:03:48 INFO CBinder: searching for libpam.so.0 at /lib64/libpam.so.0
    16/05/12 07:03:48 INFO CBinder: loading libpam.so.0 from /lib64/libpam.so.0
    16/05/12 07:03:48 INFO CBinder: loading library libprotobuf.so
    16/05/12 07:03:48 INFO CBinder: loading library libprotoc.so
    16/05/12 07:03:48 INFO CBinder: loading library libtbbmalloc.so
    16/05/12 07:03:48 INFO CBinder: loading library libtbb.so
    16/05/12 07:03:48 INFO CBinder: loading library libv2runtime.so
    16/05/12 07:03:48 INFO CBinder: loading library libv2net.so
    16/05/12 07:03:48 INFO CBinder: loading library libv2catalog_connector.so
    16/05/12 07:03:48 INFO CatalogFactory: returning a Vora catalog client of this Vora catalog server: master.i-14371789.cluster:2204
    16/05/12 07:11:56 INFO CommandHandler: Getting BROWSE data/user/17401406272892502037-4985062628452729335_f17e36cf-0003-0015-452e-800c700001fa
    16/05/12 07:11:56 INFO Utils: freeing the buffer
    16/05/12 07:11:56 INFO Utils: freeing the buffer
    16/05/12 07:12:02 INFO Utils: freeing the buffer
    16/05/12 07:12:02 WARN DefaultSource: Creating a Vora Relation that is actually persistent with a temporary statement!
    16/05/12 07:12:02 WARN DefaultSource: Creating a Vora Relation that is actually persistent with a temporary statement!
    16/05/12 07:12:02 INFO CatalogFactory: returning a Vora catalog client of this Vora catalog server: master.i-14371789.cluster:2204
    16/05/12 07:12:02 INFO Utils: freeing the buffer
    16/05/12 07:12:02 INFO DefaultSource: Creating VoraRelation testtable using an existing catalog table
    16/05/12 07:12:02 INFO Utils: freeing the buffer
    16/05/12 07:12:11 INFO Utils: freeing the buffer
    16/05/12 07:14:15 ERROR RequestOrchestrator: Result set was not fetched by connected Client. Hence cancelled the execution
    16/05/12 07:14:15 ERROR RequestOrchestrator: org.apache.spark.SparkException: Job 0 cancelled part of cancelled job group f17e36cf-0003-0015-452e-800c70000216
            at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
            at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:1229)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply$mcVI$sp(DAGScheduler.scala:681)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply(DAGScheduler.scala:681)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply(DAGScheduler.scala:681)
            at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
            at org.apache.spark.scheduler.DAGScheduler.handleJobGroupCancelled(DAGScheduler.scala:681)
            at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1475)
            at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
            at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
            at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
            at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
            at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:902)
            at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:900)
            at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
            at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
            at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
            at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:900)
            at com.sap.hana.spark.network.CommandHandler$$anonfun$receive$2$$anonfun$applyOrElse$7.apply(CommandRouter.scala:383)
            at com.sap.hana.spark.network.CommandHandler$$anonfun$receive$2$$anonfun$applyOrElse$7.apply(CommandRouter.scala:362)
            at scala.collection.immutable.List.foreach(List.scala:318)
            at com.sap.hana.spark.network.CommandHandler$$anonfun$receive$2.applyOrElse(CommandRouter.scala:362)
            at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
            at com.sap.hana.spark.network.CommandHandler.aroundReceive(CommandRouter.scala:204)
            at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
            at akka.actor.ActorCell.invoke(ActorCell.scala:487)
            at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
            at akka.dispatch.Mailbox.run(Mailbox.scala:220)
            at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
            at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
            at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
            at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
            at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    

This error is also strange:

    16/05/12 07:03:48 INFO CBinder: searching for compat-sap-c++.so at /opt/rh/SAP/lib64/compat-sap-c++.so
        16/05/12 07:03:48 WARN CBinder: could not find compat-sap-c++.so
    

because I do have the file at that location:

    ls /opt/rh/SAP/lib64/

    compat-sap-c++.so
    

After changing com.sap.hana.aws.extensions.AWSResolver to com.sap.hana.spark.aws.extensions.AWSResolver, the log file looks different:

    16/05/17 10:04:08 INFO CommandHandler: Getting BROWSE data/user/9110494231822270485-5373255807276155190_7e6efa3c-0003-0015-4a91-a3b020000139
    16/05/17 10:04:13 INFO CommandHandler: Getting BROWSE data/user/9110494231822270485-5373255807276155196_7e6efa3c-0003-0015-4a91-a3b02000013f
    16/05/17 10:04:13 INFO Utils: freeing the buffer
    16/05/17 10:04:13 INFO Utils: freeing the buffer
    16/05/17 10:04:13 INFO Utils: freeing the buffer
    16/05/17 10:04:13 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 WARN DefaultSource: Creating a Vora Relation that is actually persistent with a temporary statement!
    16/05/17 10:04:29 WARN DefaultSource: Creating a Vora Relation that is actually persistent with a temporary statement!
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO DefaultSource: Creating VoraRelation testtable using an existing catalog table
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO ConfigurableHostMapper: Load Strategy: RELAXEDLOCAL (default)
    16/05/17 10:04:29 INFO HdfsBlockRetriever: Length of HDFS file (/user/vora/test.csv): 10 bytes.
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO ConfigurableHostMapper: Load Strategy: RELAXEDLOCAL (default)
    16/05/17 10:04:29 INFO TableLoader: Loading table [testtable]
    16/05/17 10:04:29 INFO ConfigurableHostMapper: Load Strategy: RELAXEDLOCAL (default)
    16/05/17 10:04:29 INFO TableLoader: Initialized 1 loading threads. Waiting until finished... -- 0.00 s
    16/05/17 10:04:29 INFO TableLoader: [secondary2.i-a5361638.cluster:2202] Host mapping (Ranges: 1/1 Size: 0.00 MB)
    16/05/17 10:04:29 INFO VoraJdbcClient: [secondary2.i-a5361638.cluster:2202] MultiLoad: MULTIFILE
    16/05/17 10:04:29 INFO TableLoader: [secondary2.i-a5361638.cluster:2202] Host finished:
        Raw ranges: 1/1
        Size:       0.00 MB
        Time:       0.29 s
        Throughput: 0.00 MB/s
    16/05/17 10:04:29 INFO TableLoader: Finished 1 loading threads. -- 0.29 s
    16/05/17 10:04:29 INFO TableLoader: Updated catalog -- 0.01 s
    16/05/17 10:04:29 INFO TableLoader: Table load statistics:
        Name: testtable
        Size: 0.00 MB
        Hosts: 1
        Time: 0.30 s
        Cluster throughput: 0.00 MB/s
        Avg throughput per host: 0.00 MB/s
    16/05/17 10:04:29 INFO Utils: freeing the buffer
    16/05/17 10:04:29 INFO TableLoader: Loaded table [testtable] -- 0.37 s
    16/05/17 10:04:38 INFO Utils: freeing the buffer
    16/05/17 10:06:43 ERROR RequestOrchestrator: Result set was not fetched by connected Client. Hence cancelled the execution
    16/05/17 10:06:43 ERROR RequestOrchestrator: org.apache.spark.SparkException: Job 1 cancelled part of cancelled job group 7e6efa3c-0003-0015-4a91-a3b02000015b
            at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
            at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:1229)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply$mcVI$sp(DAGScheduler.scala:681)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply(DAGScheduler.scala:681)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply(DAGScheduler.scala:681)
            at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
            at org.apache.spark.scheduler.DAGScheduler.handleJobGroupCancelled(DAGScheduler.scala:681)
            at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1475)
            at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
            at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
            at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
            at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
            at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
            at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:902)
            at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:900)
            at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
            at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
            at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
            at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:900)
            at com.sap.hana.spark.network.CommandHandler$$anonfun$receive$2$$anonfun$applyOrElse$7.apply(CommandRouter.scala:383)
            at com.sap.hana.spark.network.CommandHandler$$anonfun$receive$2$$anonfun$applyOrElse$7.apply(CommandRouter.scala:362)
            at scala.collection.immutable.List.foreach(List.scala:318)
            at com.sap.hana.spark.network.CommandHandler$$anonfun$receive$2.applyOrElse(CommandRouter.scala:362)
            at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
            at com.sap.hana.spark.network.CommandHandler.aroundReceive(CommandRouter.scala:204)
            at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
            at akka.actor.ActorCell.invoke(ActorCell.scala:487)
            at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
            at akka.dispatch.Mailbox.run(Mailbox.scala:220)
            at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
            at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
            at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
            at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
            at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    

I still get "Result set was not fetched by connected Client", but now it looks like Vora loads the table.

Does anyone have an idea how to fix this? The same error occurs when I try to read a Hive table instead of a Vora table.

      

    Error: SAP DBTech JDBC: [403]: internal error: Error opening the cursor for the remote database for query "SELECT "vora_conn_testtable"."a1", "vora_conn_testtable"."a2", "vora_conn_testtable"."a3" FROM "spark_velocity"."testtable" "vora_conn_testtable" LIMIT 200"

4 Answers:

Answer 0 (score: 1)

I faced the same problem and have just solved it! The cause is that HANA cannot resolve the hostnames of the worker nodes. The Spark Controller sends HANA the names of the worker nodes that hold the Spark RDD; if HANA cannot resolve those hostnames, it cannot fetch the result, and the error occurs.

Please check the hosts file on your HANA machine.
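
For example, on the HANA host you can verify name resolution and, if needed, add one entry per Spark worker to /etc/hosts (the addresses and hostnames below are placeholders for the censored EC2-internal names above):

    # check whether the HANA host can resolve a Spark worker's hostname
    getent hosts ip-10-0-0-11.ec2.internal

    # if resolution fails, add entries like these to /etc/hosts:
    # 10.0.0.11   ip-10-0-0-11.ec2.internal
    # 10.0.0.12   ip-10-0-0-12.ec2.internal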

Answer 1 (score: 0)

The log shows the error "Result set was not fetched by connected Client. Hence cancelled the execution". The client in this context is HANA, which tries to fetch the result set from Vora.

The error could be caused by a connectivity problem between HANA and Vora.

  1. Your hanaes-site.xml shows sap.hana.ar.provider=com.sap.hana.aws.extensions.AWSResolver. This looks like a typo. Assuming you use the aws.resolver-1.5.8.jar included in the lib directory after deploying HANASPARKCTRL00P_5-70001262.RPM, the correct value should be com.sap.hana.spark.aws.extensions.AWSResolver. See SAP Note 2273047 - SAP HANA Spark Controller SPS 11 (Compatible with Spark 1.5.2) and the attached PDF file.
  2. Make sure the required ports are open: see HANA Admin Guide > 9.2.3.3 Spark Controller Configuration Parameters > ports 56000-58000 on all Spark executor nodes.
  3. If the problem persists, check the Spark executor logs for errors (see the sketch after this list):

    1. Start the Spark Controller and reproduce the problem/error.
    2. Navigate to the Yarn ResourceManager UI at http://<host>:8088 (Ambari provides a quick link: Ambari > Yarn > Quick Links > ResourceManager UI).
    3. In the ResourceManager UI, click the "ApplicationMaster" link in the "Tracking UI" column of the running Spark Controller application.
    4. In the Spark UI, open the "Executors" tab; then, for each executor, click "stdout" and "stderr" and check for errors.

  4. Unrelated: spark.vora.hosts and spark.vora.zkurls are deprecated in Vora 1.2; you can remove them from hanaes-site.xml.
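
If you prefer the command line, something like the following can help with steps 2 and 3 (the hostname and application id are placeholders; yarn logs requires log aggregation to be enabled to return output):

    # is the executor port range reachable from the HANA side?
    nc -zv ip-10-0-0-11.ec2.internal 56000

    # dump stdout/stderr of all executors of the controller application
    yarn logs -applicationId application_1463000000000_0001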

Answer 2 (score: 0)

Finally, after hours of digging through the documentation, I found the problem. It turned out I was missing some parameters in my Yarn configuration (I have no idea why this affects the HANA-Vora connection).

Here is what I did:

  1. Open the yarn-site.xml file in an editor, or log in to the Ambari Web UI and select Yarn > Config. Find the property "yarn.nodemanager.aux-services" and add "spark_shuffle" to its current value; the new value should be "mapreduce_shuffle,spark_shuffle".
  2. Add or edit the property "yarn.nodemanager.aux-services.spark_shuffle.class" and set it to "org.apache.spark.network.yarn.YarnShuffleService".
  3. Copy the spark-yarn-shuffle.jar file from Spark into the Hadoop-Yarn classpath on all Node Manager hosts; this folder is usually /usr/hdp/<version>/hadoop-yarn/lib.
  4. Restart Yarn and the Node Managers.

Answer 3 (score: 0)

I struggled with this problem for a couple of days. It was caused by blocked ports on the Spark Controller host. We run this environment on AWS, and I was able to resolve the error by updating the security group of the Spark host and opening ports 7800-7899; after that, HANA was able to view the Hive tables in SDA.
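
With the AWS CLI, that change corresponds to something like the following (the security group id and source CIDR are placeholders for your environment):

    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp \
        --port 7800-7899 \
        --cidr 10.0.0.0/24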

Hope this helps someone someday :)