I am using spark-atlas-connector (https://github.com/hortonworks-spark/spark-atlas-connector) to push Spark job metadata to Atlas. I followed the steps given in that repository.
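For reference, the connector is usually registered through Spark listener settings along the following lines. This is only a sketch: the property names are standard Spark settings, the listener class is the one that appears in the stack trace in this post, and the jar path is a hypothetical placeholder rather than the path actually used here.

```
# spark-defaults.conf (each entry can also be passed to spark-submit with --conf)
# Listener class name taken from the stack trace in this post
spark.extraListeners                 com.hortonworks.spark.atlas.SparkAtlasEventTracker
spark.sql.queryExecutionListeners    com.hortonworks.spark.atlas.SparkAtlasEventTracker
# Hypothetical placeholder path to the connector assembly jar
spark.jars                           /path/to/spark-atlas-connector-assembly.jar
```

The `ExecutionListenerManager` frames in the trace suggest the tracker is being loaded through `spark.sql.queryExecutionListeners`, so the registration itself appears to take effect; the failure happens later, when the Atlas REST response is deserialized.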
When I run a basic word count Spark job, I get the following error:
18/07/25 11:57:52 ERROR ClientResponse: A message body reader for Java class org.apache.atlas.model.typedef.AtlasTypesDef, and Java type class org.apache.atlas.model.typedef.AtlasTypesDef, and MIME media type application/json;charset=UTF-8 was not found
18/07/25 11:57:52 ERROR ClientResponse: The registered message body readers compatible with the MIME media type are:
*/* ->
com.sun.jersey.core.impl.provider.entity.FormProvider
com.sun.jersey.core.impl.provider.entity.StringProvider
com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
com.sun.jersey.core.impl.provider.entity.FileProvider
com.sun.jersey.core.impl.provider.entity.InputStreamProvider
com.sun.jersey.core.impl.provider.entity.DataSourceProvider
com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General
com.sun.jersey.core.impl.provider.entity.ReaderProvider
com.sun.jersey.core.impl.provider.entity.DocumentProvider
com.sun.jersey.core.impl.provider.entity.SourceProvider$StreamSourceReader
com.sun.jersey.core.impl.provider.entity.SourceProvider$SAXSourceReader
com.sun.jersey.core.impl.provider.entity.SourceProvider$DOMSourceReader
com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General
com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General
com.sun.jersey.core.impl.provider.entity.XMLRootObjectProvider$General
com.sun.jersey.core.impl.provider.entity.EntityHolderReader
18/07/25 11:57:52 ERROR SparkAtlasEventTracker: Fail to initialize Atlas client, stop this listener
org.apache.atlas.AtlasServiceException: Metadata service API GET : api/atlas/v2/types/typedefs/ failed
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:325)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:287)
at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:469)
at org.apache.atlas.AtlasClientV2.getAllTypeDefs(AtlasClientV2.java:131)
at com.hortonworks.spark.atlas.RestAtlasClient.getAtlasTypeDefs(RestAtlasClient.scala:58)
at com.hortonworks.spark.atlas.types.SparkAtlasModel$$anonfun$checkAndGroupTypes$1.apply(SparkAtlasModel.scala:107)
at com.hortonworks.spark.atlas.types.SparkAtlasModel$$anonfun$checkAndGroupTypes$1.apply(SparkAtlasModel.scala:104)
at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
at com.hortonworks.spark.atlas.types.SparkAtlasModel$.checkAndGroupTypes(SparkAtlasModel.scala:104)
at com.hortonworks.spark.atlas.types.SparkAtlasModel$.checkAndCreateTypes(SparkAtlasModel.scala:71)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.initializeSparkModel(SparkAtlasEventTracker.scala:108)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:48)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:39)
at com.hortonworks.spark.atlas.SparkAtlasEventTracker.<init>(SparkAtlasEventTracker.scala:43)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2743)
at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2732)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2732)
at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$$lessinit$greater$1.apply(QueryExecutionListener.scala:83)
at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$$lessinit$greater$1.apply(QueryExecutionListener.scala:82)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.util.ExecutionListenerManager.<init>(QueryExecutionListener.scala:82)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$listenerManager$2.apply(BaseSessionStateBuilder.scala:270)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$listenerManager$2.apply(BaseSessionStateBuilder.scala:270)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.listenerManager(BaseSessionStateBuilder.scala:269)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:297)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1070)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:141)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:140)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:140)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:137)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:178)
at org.apache.spark.sql.Dataset$.apply(Dataset.scala:65)
at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:470)
at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:228)
at com.oi.spline.main.SparkAtlasConnector$.main(SparkAtlasConnector.scala:20)
at com.oi.spline.main.SparkAtlasConnector.main(SparkAtlasConnector.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:906)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.sun.jersey.api.client.ClientHandlerException: A message body reader for Java class org.apache.atlas.model.typedef.AtlasTypesDef, and Java type class org.apache.atlas.model.typedef.AtlasTypesDef, and MIME media type application/json;charset=UTF-8 was not found
at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:630)
at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:604)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:321)
... 59 more
18/07/25 11:57:53 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
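I searched for this error, but every solution I found was about the jersey-json dependency, @XmlRootElement, and so on, which does not seem to apply to this Spark code.
The Spark code is a basic word count program. I am running it with spark-atlas-connector to see the job lineage in Apache Atlas, but I hit the exception shown above. My cluster is Kerberized, and I have followed all the steps given for a Kerberized environment. Any help would be appreciated.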