我创建了一个使用Spark-Streaming和自定义接收方Google Pub / Sub的应用程序。
我达到了性能极限,并希望在不进行处理的情况下删除消息。我有一个想法store()
,用于阅读邮件
我使用了 apache / bahir 接收器
val pullResponse = client.projects().subscriptions().pull(subscriptionFullName, pullRequest).execute()
val receivedMessages = pullResponse.getReceivedMessages.asScala.toList
Utils.LOG.info(s"receivedMessages from PUB/SUB ${receivedMessages.size}")
rateLimiter.acquire(receivedMessages.size)
var factor: Int = 0
if (dropFactorBroad != null) {
factor = dropFactorBroad.value
} else {
Utils.LOG.info("dropFactorBroad is null")
}
val endIndex = if (factor > receivedMessages.length) receivedMessages.length else factor
val messagesToStore = receivedMessages.slice(0, receivedMessages.length - endIndex)
store(messagesToStore.map(x => {
val sm = new SparkPubsubMessage
sm.message = x.getMessage
sm
})
.iterator)
val ackRequest = new AcknowledgeRequest()
ackRequest.setAckIds(receivedMessages.map(x => x.getAckId).asJava)
client.projects().subscriptions().acknowledge(subscriptionFullName, ackRequest).execute()
dropFactorBroad
-是广播变量,它在每个onBatchCompleted
上更新(不再持久化并再次创建)
它不起作用,我得到
java.lang.NullPointerException
at com.mag.ingester.ReceiverDropFactorBroadcaster.value(ReceiverDropFactorBroadcaster.scala:20)
at com.mag.pubSubReceiver.PubsubReceiver.receive(PubsubInputDStream.scala:260)
at com.mag.pubSubReceiver.PubsubReceiver$$anon$1.run(PubsubInputDStream.scala:244)
ReceiverDropFactorBroadcaster
是dropFactorBroad
如何控制收货商店? 我应该杀死接收者更改变量并重新启动吗? (怎么办?)
谢谢