Question

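I am running a Spark Structured Streaming application with many streaming queries. The progress of all these queries is reported on stdout. I want to remove the default StreamingQueryListener because it prints overly verbose JSON like the following: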

{
  "id" : "03fc78fc-fe19-408c-a1ae-812d0e28fcee",
  "runId" : "8c247071-afba-40e5-aad2-0e6f45f22488",
  "name" : null,
  "timestamp" : "2017-08-14T20:30:00.004Z",
  "batchId" : 1,
  "numInputRows" : 432,
  "inputRowsPerSecond" : 0.9993568953312452,
  "processedRowsPerSecond" : 1380.1916932907347,
  "durationMs" : {
    "addBatch" : 237,
    "getBatch" : 26,
    "getOffset" : 0,
    "queryPlanning" : 1,
    "triggerExecution" : 313,
    "walCommit" : 45
  },
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "RateSource[rowsPerSecond=1, rampUpTimeSeconds=0, numPartitions=8]",
    "startOffset" : 0,
    "endOffset" : 432,
    "numInputRows" : 432,
    "inputRowsPerSecond" : 0.9993568953312452,
    "processedRowsPerSecond" : 1380.1916932907347
  } ],
  "sink" : {
    "description" : "ConsoleSink[numRows=20, truncate=true]"
  }
}

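I have already added a custom listener to the application using the {{1}} method of StreamingQueryManager; it prints the JSON data above in a concise single-line format, so I want to remove the default listener. I could use the removeListener() method, but I don't know what to pass to it as the argument to achieve this.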
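For reference, here is a minimal sketch of the kind of single-line formatting described above. It assumes the progress report is available as a JSON string with the same fields as the sample; the function name `summarize_progress` and the selection of fields are hypothetical and not taken from the actual application.

```python
import json


def summarize_progress(progress_json: str) -> str:
    """Condense one verbose progress report into a single line.

    Hypothetical helper: the field selection is an assumption,
    chosen from the keys visible in the sample report.
    """
    p = json.loads(progress_json)
    # Fall back to the first 8 characters of the query id when the query is unnamed.
    name = p.get("name") or p["id"][:8]
    trigger_ms = p.get("durationMs", {}).get("triggerExecution")
    return (
        f"query={name} batch={p['batchId']} rows={p['numInputRows']} "
        f"in/s={p['inputRowsPerSecond']:.2f} "
        f"proc/s={p['processedRowsPerSecond']:.2f} "
        f"trigger_ms={trigger_ms}"
    )


# A trimmed copy of the sample report, keeping only the fields used here.
sample = """
{
  "id" : "03fc78fc-fe19-408c-a1ae-812d0e28fcee",
  "name" : null,
  "batchId" : 1,
  "numInputRows" : 432,
  "inputRowsPerSecond" : 0.9993568953312452,
  "processedRowsPerSecond" : 1380.1916932907347,
  "durationMs" : { "triggerExecution" : 313 }
}
"""

print(summarize_progress(sample))
```

In a listener, a helper like this would presumably be called from `onQueryProgress` with the JSON form of the progress event.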

How do I remove the default StreamingQueryListener in Spark Structured Streaming?

0 Answers: