I want to use the Spark REST API to fetch metrics and publish them to CloudWatch. But the REST API URL looks like this:
val url = "http://<host>:4040/api/v1/applications/<app-name>/stages"
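It works if I supply the master host and the application ID by hand, but how can I use it inside a job and determine the master host and application name dynamically? Is there a way to get this information?
Using Spark 2.1.
I tried: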
import org.apache.spark.sql.SparkSession
import org.json4s._
import org.json4s.jackson.JsonMethods._
import scala.io.Source.fromURL
val id = spark.sparkContext.applicationId
val url = spark.sparkContext.uiWebUrl.get
case class SparkStage(name: String, shuffleWriteBytes: Long, memoryBytesSpilled: Long, diskBytesSpilled: Long)
val path = url + "/api/v1/applications/" + id + "/stages"
implicit val formats = DefaultFormats
val json = fromURL(path).mkString
val stages: List[SparkStage] = parse(json).extract[List[SparkStage]]
I got:
java.io.IOException: Server returned HTTP response code: 500 for URL: http://112.21.2.151:4040/api/v1/applications/application_1515337161733_0001
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1876)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
at java.net.URL.openStream(URL.java:1045)
at scala.io.Source$.fromURL(Source.scala:141)
at scala.io.Source$.fromURL(Source.scala:131)
... 64 elided
Answer 0 (score: 3)
If you know the host, you can query the applications endpoint:
http://localhost:4040/api/v1/applications
and parse the result to get the application ID.
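As a rough illustration of that parsing step, here is a minimal sketch. It assumes json4s with the Jackson backend is on the classpath (the question already uses json4s). `AppInfo`, `ListApplications`, `parseApps`, and `fetchApps` are hypothetical names introduced for this example; only the `id` and `name` fields of each array element are read, and the rest is ignored.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import scala.io.Source

// Hypothetical holder for the two fields we care about;
// other fields in each JSON object are ignored during extraction.
case class AppInfo(id: String, name: String)

object ListApplications {
  implicit val formats: Formats = DefaultFormats

  // Parse the JSON array returned by the applications endpoint.
  def parseApps(json: String): List[AppInfo] =
    parse(json).extract[List[AppInfo]]

  // Fetch and parse the list; baseUrl is e.g. "http://localhost:4040".
  def fetchApps(baseUrl: String): List[AppInfo] = {
    val src = Source.fromURL(baseUrl + "/api/v1/applications")
    try parseApps(src.mkString) finally src.close()
  }
}
```

Calling `ListApplications.fetchApps("http://localhost:4040").map(_.id)` would then give the IDs known to that UI, provided the host is reachable.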
To get the applicationId and host from within the application, use the corresponding SparkContext methods:
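val spark: SparkSession = SparkSession.builder.getOrCreate()
spark.sparkContext.applicationId // String: ID of the running application
spark.sparkContext.uiWebUrl      // Option[String]: base URL of the UI; None if the UI is disabled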
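Those two values could be combined into the stages URL from the question along these lines. This is only a sketch: `StagesUrl` and `stagesUrl` are hypothetical names, and the helper takes plain values so it can be tried without a running Spark application. It keeps the `Option` rather than calling `.get`, so a missing UI yields `None` instead of an exception.

```scala
object StagesUrl {
  // Build the REST URL for the stages of one application.
  // uiWebUrl is optional, so the result is optional as well.
  def stagesUrl(uiWebUrl: Option[String], appId: String): Option[String] =
    uiWebUrl.map { base =>
      val root = base.stripSuffix("/") // avoid a double slash
      s"$root/api/v1/applications/$appId/stages"
    }
}
```

Inside a job this might be called as `StagesUrl.stagesUrl(spark.sparkContext.uiWebUrl, spark.sparkContext.applicationId)`.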