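I have 2 receivers in my Spark Streaming application. When I set spark.streaming.receiver.maxRate
to 1000, the maximum rate should in theory be 2 * 1000 records per second. In practice, the rate is often slightly above 2000/s and sometimes reaches 3000/s. Is this normal, or is something wrong?
P.S. The Spark version is 1.4.0 and the data comes from Kafka. The configuration is as follows: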
conf.set("spark.executor.memory", "4g")
conf.set("spark.executor.extraJavaOptions", "-XX:+UseConcMarkSweepGC") // executor
conf.set("spark.driver.extraJavaOptions", "-XX:+UseConcMarkSweepGC") // driver
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.streaming.receiver.maxRate", "1000") // maxRate: 1K per second per receiver
conf.set("spark.streaming.blockInterval", "100") // block interval in milliseconds
conf.set("spark.streaming.concurrentJobs", "3")
conf.set("spark.shuffle.consolidateFiles", "true")
conf.set("spark.eventLog.enabled", "true")
conf.set("spark.streaming.stopGracefullyOnShutdown", "true") // not documented but works in 1.4.0
conf.set("spark.eventLog.dir", "hdfs://" + nameService + "/spark/app_history")
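
To make the expected ceiling explicit, here is a minimal sketch of the arithmetic I am assuming. It uses no Spark APIs; the object name ExpectedRate, its helper methods, and the 10-second batch interval are hypothetical and only for illustration (my actual batch interval is not shown above).

```scala
// Sketch of the ceiling I expect from spark.streaming.receiver.maxRate.
// Plain arithmetic only; nothing here calls Spark.
object ExpectedRate {
  // Upper bound on records per second across all receivers,
  // assuming maxRate is enforced independently for each receiver.
  def maxRecordsPerSecond(numReceivers: Int, maxRatePerReceiver: Long): Long =
    numReceivers * maxRatePerReceiver

  // Upper bound on records in one batch of the given length.
  def maxRecordsPerBatch(numReceivers: Int, maxRatePerReceiver: Long, batchSeconds: Long): Long =
    maxRecordsPerSecond(numReceivers, maxRatePerReceiver) * batchSeconds

  def main(args: Array[String]): Unit = {
    val receivers = 2      // from the question
    val maxRate = 1000L    // spark.streaming.receiver.maxRate
    val batchSeconds = 10L // assumption: hypothetical batch interval

    println(maxRecordsPerSecond(receivers, maxRate))              // 2000
    println(maxRecordsPerBatch(receivers, maxRate, batchSeconds)) // 20000
  }
}
```

The observed 3000/s is about 1.5 times the per-second figure computed here, which is the gap I am asking about.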