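I am developing a Spark Streaming application that receives JSON messages and needs to parse them. The application has two parts, but in testing the JSON parsing part appears to be the larger overhead. Is there a way to optimize this?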
import scala.util.parsing.json.JSON
val parsed = JSON.parseFull(formatted)
val subject = parsed.flatMap(_.asInstanceOf[Map[String, String]].get("subject")).toString.drop(5).dropRight(1)
val predicate = parsed.flatMap(_.asInstanceOf[Map[String, String]].get("predicate")).toString.drop(5).dropRight(1)
val obj = parsed.flatMap(_.asInstanceOf[Map[String, String]].get("object")).toString.drop(5).dropRight(1)
val label = parsed.flatMap(_.asInstanceOf[Map[String, String]].get("label")).toString.drop(5).dropRight(1)
val url = "http://" + elasticAddress.value + "/data/quad/"
val urlEncoded = java.net.URLEncoder.encode(label + subject + predicate + obj, "utf-8")
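Answer 0 (score: 0)
Are you also using the Play framework in your project? If so, the Play JSON library would certainly cut down your code and make it more readable (for example, it makes it easy to convert a message into a case class with a matching structure), although I don't know how well it would perform for you from an efficiency standpoint.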
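As a rough illustration of the case-class conversion mentioned above, here is a minimal sketch, assuming Scala 2 with the Play JSON library on the classpath and a message that is a single JSON object with the four string fields from the question. The names `PlayJsonSketch`, `Quad`, and `parseQuad` are hypothetical and not from the original post, and the sketch says nothing about whether this is faster than the original code.

```scala
import play.api.libs.json._
import play.api.libs.functional.syntax._

object PlayJsonSketch {
  // Hypothetical case class mirroring the four fields read in the question.
  // The JSON key "object" is mapped to `obj` because `object` is a Scala keyword.
  case class Quad(subject: String, predicate: String, obj: String, label: String)

  object Quad {
    // Reads built with Play JSON's functional combinators.
    implicit val quadReads: Reads[Quad] = (
      (JsPath \ "subject").read[String] and
      (JsPath \ "predicate").read[String] and
      (JsPath \ "object").read[String] and
      (JsPath \ "label").read[String]
    )(Quad.apply _)
  }

  // Parses the message once and returns None if a field is missing or not a string.
  // Note: Json.parse itself throws if the input is not valid JSON.
  def parseQuad(message: String): Option[Quad] =
    Json.parse(message).validate[Quad].asOpt
}
```

With this in place, the four values come out of one parse as typed fields, for example `PlayJsonSketch.parseQuad(formatted).map(q => q.label + q.subject + q.predicate + q.obj)`, instead of casting the parsed map once per field.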
Answer 1 (score: 0)
I have changed it to:
import org.json4s.JsonAST.{JField, JObject, JString, JArray, JValue}
import org.json4s.jackson.JsonMethods._
val parsed = parse(data)
val output: List[(String, String, String, String)] = for {
  JArray(sys) <- parsed
  JObject(child) <- sys
  JField("subject", JString(subject)) <- child
  JField("predicate", JString(predicate)) <- child
  JField("object", JString(obj)) <- child
  JField("label", JString(label)) <- child
} yield (subject, predicate, obj, label)
val subject = output(0)._1
val predicate = output(0)._2
val obj = output(0)._3
val label = output(0)._4