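Unable to set up my Spark application with Apache Atlas using spark-atlas-connector
I cloned the https://github.com/hortonworks-spark/spark-atlas-connector project and ran mvn package. Then I added all the resulting jars to my project and set the configuration as follows: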
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    object Boot {
      // BROKER_SERVERS (the Kafka bootstrap servers string) is assumed to be defined elsewhere.
      def main(args: Array[String]): Unit = {
        // Register the Atlas connector's listeners with Spark.
        val sparkConf = new SparkConf()
          .setAppName("atlas-test")
          .setMaster("local[2]")
          .set("spark.extraListeners", "com.hortonworks.spark.atlas.SparkAtlasEventTracker")
          .set("spark.sql.queryExecutionListeners", "com.hortonworks.spark.atlas.SparkAtlasEventTracker")
          .set("spark.sql.streaming.streamingQueryListeners", "com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker")

        val spark = SparkSession.builder()
          .config(sparkConf)
          .enableHiveSupport()
          .getOrCreate()

        import spark.implicits._

        // Batch-read the source topic from the earliest offsets.
        val df = spark.read.format("kafka")
          .option("kafka.bootstrap.servers", BROKER_SERVERS)
          .option("subscribe", "foobar")
          .option("startingOffsets", "earliest")
          .load()

        df.show()

        // Write the records to a second topic.
        df.write
          .format("kafka")
          .option("kafka.bootstrap.servers", BROKER_SERVERS)
          .option("topic", "foobar-out")
          .save()
      }
    }
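Atlas is started from a Docker container that I pulled. Kafka with ZooKeeper is also started from a Docker container that I pulled.
The job works without spark-atlas-connector, but as soon as I add the connector it throws an exception: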
19/08/09 16:40:16 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Exception when registering SparkListener
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2398)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:555)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
at Boot$.main(Boot.scala:21)
at Boot.main(Boot.scala)
Caused by: org.apache.atlas.AtlasException: Failed to load application properties
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:134)
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:86)
at com.hortonworks.spark.atlas.AtlasClientConf.configuration$lzycompute(AtlasClientConf.scala:25)
at com.hortonworks.spark.atlas.AtlasClientConf.configuration(AtlasClientConf.scala:25)
at com.hortonworks.spark.atlas.AtlasClientConf.get(AtlasClientConf.scala:50)
at com.hortonworks.spark.atlas.AtlasClient$.atlasClient(AtlasClient.scala:120)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:33)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:37)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2691)
at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2680)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2680)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2387)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2386)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2386)
... 8 more
Caused by: com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.ConfigurationException: Cannot locate configuration source null
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:259)
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:238)
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.AbstractFileConfiguration.<init>(AbstractFileConfiguration.java:197)
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.PropertiesConfiguration.<init>(PropertiesConfiguration.java:284)
at org.apache.atlas.ApplicationProperties.<init>(ApplicationProperties.java:69)
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:125)
... 32 more
19/08/09 16:40:16 INFO SparkContext: SparkContext already stopped.
Exception in thread "main" org.apache.spark.SparkException: Exception when registering SparkListener
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2398)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:555)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
at Boot$.main(Boot.scala:21)
at Boot.main(Boot.scala)
Caused by: org.apache.atlas.AtlasException: Failed to load application properties
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:134)
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:86)
at com.hortonworks.spark.atlas.AtlasClientConf.configuration$lzycompute(AtlasClientConf.scala:25)
at com.hortonworks.spark.atlas.AtlasClientConf.configuration(AtlasClientConf.scala:25)
at com.hortonworks.spark.atlas.AtlasClientConf.get(AtlasClientConf.scala:50)
at com.hortonworks.spark.atlas.AtlasClient$.atlasClient(AtlasClient.scala:120)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:33)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:37)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2691)
at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2680)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2680)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2387)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2386)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2386)
... 8 more
Caused by: com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.ConfigurationException: Cannot locate configuration source null
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:259)
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:238)
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.AbstractFileConfiguration.<init>(AbstractFileConfiguration.java:197)
at com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.PropertiesConfiguration.<init>(PropertiesConfiguration.java:284)
at org.apache.atlas.ApplicationProperties.<init>(ApplicationProperties.java:69)
at org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:125)
... 32 more
19/08/09 16:40:17 INFO ShutdownHookManager: Shutdown hook called
Answer 0 (score: 2)
System.setProperty("atlas.conf", "") is the correct solution, as the OP pointed out. SAC uses ApplicationProperties.java.
In particular, it uses ApplicationProperties.get. The source code is here: https://github.com/apache/atlas/blob/master/intg/src/main/java/org/apache/atlas/ApplicationProperties.java#L118
You can see that the variable ATLAS_CONFIGURATION_DIRECTORY_PROPERTY is set to "atlas.conf": https://github.com/apache/atlas/blob/master/intg/src/main/java/org/apache/atlas/ApplicationProperties.java#L43
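As I read the linked source, ApplicationProperties.get treats the value of atlas.conf as a directory and looks for the properties file inside it; only when the property is unset does it fall back to the classpath. A minimal sketch of setting the property safely, assuming that behaviour, could look like the following. The names AtlasConfSetup and pointAtlasAt are hypothetical helpers made up for this illustration and are not part of SAC or Atlas.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of spark-atlas-connector or Apache Atlas.
object AtlasConfSetup {
  // File name the Atlas client looks for, per the answers in this thread.
  val PropertiesFileName = "atlas-application.properties"

  /** Sets the "atlas.conf" system property to confDir, but only if that
    * directory actually contains atlas-application.properties.
    * Returns true when the property was set, false otherwise.
    */
  def pointAtlasAt(confDir: Path): Boolean = {
    val candidate = confDir.resolve(PropertiesFileName)
    if (Files.isRegularFile(candidate)) {
      System.setProperty("atlas.conf", confDir.toAbsolutePath.toString)
      true
    } else {
      false
    }
  }
}
```

The call would have to happen before SparkSession.builder()...getOrCreate(), because the stack trace in the question shows SparkAtlasEventTracker being constructed (and the properties being loaded) while the SparkContext is still initializing.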
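Answer 1 (score: 1)
The following should do the trick. Note that the --files and --driver-class-path options are what place this configuration file on the CLASSPATH, which is what makes it available to the Atlas client classes.
Also, the paths passed to spark-shell here are relative to the Spark Atlas Connector checkout, so change them accordingly.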
    $SPARK_HOME/bin/spark-shell \
      --jars spark-atlas-connector-assembly/target/spark-atlas-connector-assembly-0.1.0-SNAPSHOT.jar \
      --conf spark.extraListeners=com.hortonworks.spark.atlas.SparkAtlasEventTracker \
      --conf spark.sql.queryExecutionListeners=com.hortonworks.spark.atlas.SparkAtlasEventTracker \
      --conf spark.sql.streaming.streamingQueryListeners=com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker \
      --files spark-atlas-connector/src/test/resources/atlas-application.properties \
      --driver-class-path spark-atlas-connector/src/test/resources
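Answer 2 (score: 0)
I believe you missed one more step from the setup documentation. Your error stems from
Caused by: com.hortonworks.spark.atlas.shade.org.apache.commons.configuration.ConfigurationException: Cannot locate configuration source null
and, quoting the README of the GitHub repository you linked:
Also make sure the Atlas configuration file atlas-application.properties is on the driver's classpath. For example, put this file into <SPARK_HOME>/conf.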
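To confirm whether the file is actually visible before enabling the listeners, a quick classpath probe can help. The sketch below is only a diagnostic under stated assumptions: AtlasClasspathCheck and locate are hypothetical names, and the check uses the class loader that loaded this object, which may not be the same loader that loads the Atlas classes in every deployment.

```scala
import java.net.URL

// Hypothetical diagnostic, not part of spark-atlas-connector or Apache Atlas.
object AtlasClasspathCheck {
  val PropertiesFileName = "atlas-application.properties"

  /** Returns the URL of the named resource as seen by this object's
    * class loader, or None if the resource cannot be found.
    */
  def locate(name: String = PropertiesFileName): Option[URL] =
    Option(getClass.getClassLoader.getResource(name))

  def main(args: Array[String]): Unit =
    locate() match {
      case Some(url) => println(s"Found $PropertiesFileName at $url")
      case None      => println(s"$PropertiesFileName is not on the classpath")
    }
}
```

If the probe reports that the file is missing, the "Cannot locate configuration source null" error from the question is the expected outcome once the connector tries to load its configuration.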
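Answer 3 (score: 0)
Please refer to this passage from the official spark-atlas-connector GitHub page. The atlas-application.properties file must be accessible.
Also make sure the Atlas configuration file atlas-application.properties is on the driver's classpath. For example, put this file into <SPARK_HOME>/conf. If you are using cluster mode, also ship this conf file to the remote driver with --files atlas-application.properties.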