Gatling: running checks on JSON embedded in an HTML response

Date: 2019-05-17 08:10:53

Tags: gatling scala-gatling gatling-jsonpath

In Gatling, I want to run checks against some JSON that is embedded in an HTML response like the following:

<!doctype html>
<html lang="fr">
  <head>
    <script>
      var documentLoaded = performance.now();
    </script>
    <link rel="stylesheet" href="/styles/main.f14d8fab5a7e.css">
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">

    <link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
    <link rel="manifest" href="/manifest.json">
    <link rel="preconnect" href="https://www.gstatic.com">

    <title data-react-helmet="true">Asus Discount</title>
    <meta data-react-helmet="true" name="description" content="Asus discount"/><meta data-react-helmet="true" name="keywords" content="Asus"/>

  </head>
  <body>
  <div>Some content</div>

  <script>
      var parseStart = performance.now();
  </script>

  <script>
    window.__INITIAL_STATE__ = {some JSON}; <!-- This is what I need -->
    window.__ENV_VARIABLES__ = {some other JSON};
    window.renderTime = '76';
    window.fetchTime = '349';
  </script>
  <script type="text/javascript" charset="utf-8" src="/vendor.e33d9940372.js"></script>
  <script type="application/ld+json" src="/schema.fr.json"></script>
  </body>
</html>

My current solution (which works) is the following:

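// Assumed imports: io.gatling.core.Predef._, io.gatling.http.Predef._,
// io.gatling.core.structure.ChainBuilder
def loadPageJsonInHTML(requestName: String, link: String): ChainBuilder = {
  exec(
    http(requestName)
      .get(link)
      // Capture the JSON assigned to window.__INITIAL_STATE__ and turn it into a list of product codes.
      .check(regex("""window[.]__INITIAL_STATE__ = ([^;]+)""").find.transform(s => parseSToProdList(s)).saveAs("prod_list"))
  )
  // The regex check passes even when no product is found, so fail the session manually.
  // Chained with '.', otherwise the exec above is discarded; the condition reads the session
  // instead of comparing an EL string to 0, which is always false.
  .doIf(session => session("prod_list").asOption[Seq[String]].forall(_.isEmpty)) {
    exec { session => session.markAsFailed }
  }
}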

// Assumed imports: org.json4s._ and org.json4s.jackson.JsonMethods.parse
// (or org.json4s.native.JsonMethods.parse, depending on the json4s backend in use).

// Returns the "code" of every product that has one.
def parseSToProdList(jsonString: String): Seq[String] = {
  val products = jsonStrToMap(jsonString)("products").asInstanceOf[Map[String, Any]]
  products.values.toSeq
    .map(_.asInstanceOf[Map[String, Any]])
    .filter(_.contains("code"))
    .map(_("code").asInstanceOf[String])
}

// Parses a JSON object into a plain Scala Map.
def jsonStrToMap(jsonStr: String): Map[String, Any] = {
  implicit val formats: org.json4s.Formats = org.json4s.DefaultFormats
  parse(jsonStr).extract[Map[String, Any]]
}

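However, this solution has several drawbacks:

  1. The check always succeeds as long as the regex matches, whether or not the JSON contains any products, so I have to verify that manually afterwards;
  2. Extracting the data with a custom function is harder than using a JSON Path expression (for example "$.products.*.code", which could be kept in a centralized paths file for easier maintenance);
  3. This is the only place where I need a transform to check the JSON of a request, which makes the code harder to read and understand.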
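
As a side note on the second drawback, here is a minimal sketch of how the casts in parseSToProdList might be avoided by pattern matching on the json4s AST. It assumes json4s with the Jackson backend is on the classpath; the names ProductCodes and fromJson are hypothetical. It does not replace a JSON Path check, it only shortens the transform function.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse // assumption: Jackson backend of json4s

// Hypothetical helper, not part of the original code.
object ProductCodes {
  // Collects the string "code" field of each object found under "products".
  def fromJson(jsonString: String): Seq[String] =
    parse(jsonString) \ "products" match {
      case JObject(products) =>
        for {
          (_, JObject(fields))    <- products // skip entries that are not JSON objects
          ("code", JString(code)) <- fields   // keep only string-valued "code" fields
        } yield code
      case _ => Seq.empty // "products" is missing or is not an object
    }
}
```

With this helper, the transform in the check could become `.transform(ProductCodes.fromJson)`; products without a "code" field are skipped, as in the original function.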

What I would like to achieve is something like this:

def loadPageJsonInHTML(requestName: String, link: String): ChainBuilder = {
  exec(
    http(requestName)
      .get(link)
      .check(jsonPath("""$.products.*.code""").findAll.saveAs("prod_list"))
  )
}

or:

def loadPageJsonInHTML(requestName: String, link: String): ChainBuilder = {
  exec(
    http(requestName)
      .get(link)
      .check(jsonpJsonPath("""$.products.*.code""").findAll.saveAs("prod_list"))
  )
}

Of course, jsonPath does not work because most of the response is HTML. jsonpJsonPath does not work either, because the response contains several JSON strings.

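Does anyone have a good suggestion for doing this more efficiently (and more cleanly) while avoiding a regular expression over the HTML? Thanks in advance.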

1 Answer:

Answer 0: (score: 0)

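So, after some digging, I found a workaround using ".transformResponse": it extracts the string before the actual checks run and falls back to a default value that is still parseable as JSON. Then, to make sure the regex really matched, a check verifies that the body is not that default value: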

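  // Assumed imports (Gatling 3.x at the time of this answer):
  // io.gatling.http.response.StringResponseBody, java.nio.charset.StandardCharsets.UTF_8
  def loadPageJsonInHTML(requestName: String, link: String): ChainBuilder = {
    // Default body used when the regex does not match; it must still parse as JSON.
    val notFound = """{"error":"chain not found"}"""
    val initialState = """window[.]__INITIAL_STATE__ = ([^;]+)""".r
    exec(
      http(requestName)
        .get(link)
        // Replace the HTML body with the extracted JSON before any check runs.
        .transformResponse { (session, response) =>
          response.copy(body = new StringResponseBody(
            initialState.findFirstMatchIn(response.body.string).map(_.group(1)).getOrElse(notFound),
            UTF_8
          ))
        }
        // Fail the request if the regex did not match.
        .check(bodyString.not(notFound))
        .check(jsonPath("""$.products.*.code""").findAll.saveAs("prod_list"))
    )
  }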
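
The extraction step used inside transformResponse can be tried on its own, without Gatling. The sketch below uses only the Scala standard library and the same regex and fallback value as above; the object and member names (InitialStateExtractor, extract, NotFound) are hypothetical and exist only for illustration.

```scala
import scala.util.matching.Regex

// Hypothetical standalone version of the extraction step.
object InitialStateExtractor {
  // Same pattern as above: capture everything after the assignment up to the first ';'.
  val InitialState: Regex = """window[.]__INITIAL_STATE__ = ([^;]+)""".r

  // Placeholder returned when the pattern is absent; it must remain valid JSON.
  val NotFound: String = """{"error":"chain not found"}"""

  // Returns the captured JSON text, or the placeholder when nothing matches.
  def extract(html: String): String =
    InitialState.findFirstMatchIn(html).map(_.group(1)).getOrElse(NotFound)
}
```

One caveat worth keeping in mind: because the capture group is `[^;]+`, the match stops at the first semicolon, so a semicolon inside a JSON string value (for example in a product title) would truncate the extracted text and the jsonPath check would then fail to parse it.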