我遇到了问题,我们正在使用Kafka和spark。
我们正在使用forEachRDD:
messages.foreachRDD{ rdd =>
val newRDD = rdd.map{message =>
processMessage(message)}
println(newRDD.count())
}
但我们正在传递processMessage(message)方法。此方法将调用正在创建sparkContext的类。我一直在阅读,如果你在foreachRDD中创建了sparkContext,它会抛出一个错误。
我改变了它:
messages.map{
case (msg) =>
val newRDD3 = (processMessage(msg))
(newRDD3)
}
但我不确定这是否与foreachRDD相同。
你能帮帮我吗?
任何帮助都会非常感激。
答案 0 :(得分:0)
使用sparksession
func application(_ application: UIApplication, handleEventsForBackgroundURLSession identifier: String, completionHandler: @escaping () -> Void) {
AWSS3TransferUtility.interceptApplication(application, handleEventsForBackgroundURLSession: identifier, completionHandler: completionHandler)
}
答案 1 :(得分:0)
我创建了streamContext,然后声明了主题和KafkaParams。
最后,我创建了消息。请参阅以下代码:
def main(args: Array[String]) {
val date_today = new SimpleDateFormat("yyyy_MM_dd");
val date_today_hour = new SimpleDateFormat("yyyy_MM_dd_HH");
val PATH_SEPERATOR = "/";
val conf = ConfigFactory.load("spfin.conf")
println("kafka.duration --- "+ conf.getString("kafka.duration").toLong)
val mlFeatures: MLFeatures = new MLFeatures()
// Create context with custom second batch interval
//val sparkConf = new SparkConf().setAppName("SpFinML")
//val ssc = new StreamingContext(sparkConf, Seconds(conf.getString("kafka.duration").toLong))
val ssc = new StreamingContext(mlFeatures.sc, Seconds(conf.getString("kafka.duration").toLong))
// Create direct kafka stream with brokers and topics
val topicsSet = conf.getString("kafka.requesttopic").split(",").toSet
val kafkaParams = Map[String, Object](
"bootstrap.servers" -> conf.getString("kafka.brokers"),
"zookeeper.connect" -> conf.getString("kafka.zookeeper"),
"group.id" -> conf.getString("kafka.consumergroups2"),
"auto.offset.reset" -> conf.getString("kafka.autoOffset"),
"enable.auto.commit" -> (conf.getString("kafka.autoCommit").toBoolean: java.lang.Boolean),
"key.deserializer" -> classOf[StringDeserializer],
"value.deserializer" -> classOf[StringDeserializer],
"security.protocol" -> "SASL_PLAINTEXT")
/* this code is to get messages from request topic*/
val messages = KafkaUtils.createDirectStream[String, String](
ssc,
LocationStrategies.PreferConsistent,
ConsumerStrategies.Subscribe[String, String](topicsSet, kafkaParams))
messages.foreachRDD{ rdd =>
val newRDD = rdd.map{message =>
processMessage(message,"Inside processMessage")}
println("Inside map")
// println(newRDD.count())
}
答案 2 :(得分:0)
processMessage方法是:
我想我可能需要改变这种方法,对吧?
def processMessage(message: ConsumerRecord[String, String],msg: String): ConsumerRecord[String, String] = {
println(msg)
println("Message processed is " + message.value())
val req_message = message.value()
//val tableName = conf.getString("hbase.tableName")
//println("Hbase table name : " + tableName)
//val decisionTree_res = "PredictionModelOutput "
// val decisionTree_res = PriorAuthPredict.processPriorAuthPredict(req_message)
// println(decisionTree_res)
//
// kafkaProducer(conf.getString("kafka.responsetopic"), decisionTree_res)
kafkaProducer(conf.getString("kafka.responsetopic"), """[{"payorId":53723,"therapyType":"RMIV","ndcNumber":"'66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"'9427535101","serviceDate":"20161102","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":22957.55,"daysOrUnits":140,"algorithmType":"LR ,RF ,NB ,VoteResult","label":0.0,"prediction":"0.0 ,0.0 ,0.0 ,0.0","finalPrediction":"Approved","rejectOutcome":"Y","neighborCounter":0,"probability":"0.9307022947278968 - 0.06929770527210313 ,0.9879908798891663 - 0.012009120110833629 ,1.0 - 0.0 ,","patientGender":"M","invoiceId":0,"therapyClass":"REMODULIN","patientAge":52,"npiId":0,"prescriptionId":0,"refillNo":0,"requestId":"419568891","requestDateTime":"20171909213055","responseId":"419568891","responseDateTime":"201801103503"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160829","responseDate":"20161020","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":1,"invoiceId":46347660,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":2,"authExpiry":"20170218","exceedFlag":"N","appealFlag":"N","authNNType":"A"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160829","responseDate":"20161020","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":2,"invoiceId":46347660,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":2,"authExpiry":"20170211","exceedFlag":"N","appealFlag":"N","authNNType":"A"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160714","responseDate":"20160908","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":3,"invoiceId":45631877,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":1,"authExpiry":"20170218","exceedFlag":"N","appealFlag":"N","authNNType":"A"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160714","responseDate":"20160908","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":4,"invoiceId":45631877,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":1,"authExpiry":"20170211","exceedFlag":"N","appealFlag":"N","authNNType":"A"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160621","responseDate":"20160818","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":19677.9,"daysOrUnits":120,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":5,"invoiceId":45226407,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":0,"authExpiry":"20170218","exceedFlag":"N","appealFlag":"N","authNNType":"A"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160829","responseDate":"20161020","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":6,"invoiceId":46347660,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":2,"authExpiry":"20170218","exceedFlag":"N","appealFlag":"N","authNNType":"P"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9450829801","serviceDate":"20160829","responseDate":"20161020","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":7,"invoiceId":46347660,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":2,"authExpiry":"20170211","exceedFlag":"N","appealFlag":"N","authNNType":"P"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9450829801","serviceDate":"20160829","responseDate":"20161020","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":8,"invoiceId":46347660,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":2,"authExpiry":"20170218","exceedFlag":"N","appealFlag":"N","authNNType":"P"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9427535101","serviceDate":"20160829","responseDate":"20161020","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":9,"invoiceId":46347660,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":2,"authExpiry":"20170211","exceedFlag":"N","appealFlag":"N","authNNType":"P"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9556908,"procedureCode":"J3285","authNbr":"9450829801","serviceDate":"20160714","responseDate":"20160908","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":26237.2,"daysOrUnits":160,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":10,"invoiceId":45631877,"patientAge":0,"npiId":0,"prescriptionId":1174215,"refillNo":1,"authExpiry":"20170211","exceedFlag":"N","appealFlag":"N","authNNType":"P"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":4879911,"procedureCode":"J3285","authNbr":"9818182501","serviceDate":"20160901","responseDate":"20161027","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":13118.6,"daysOrUnits":80,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":11,"invoiceId":46419758,"patientAge":0,"npiId":0,"prescriptionId":1095509,"refillNo":7,"authExpiry":"20170626","exceedFlag":"N","appealFlag":"N","authNNType":"O"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":4056274,"procedureCode":"J3285","authNbr":"8914616801","serviceDate":"20160727","responseDate":"20161019","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":13118.6,"daysOrUnits":80,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":12,"invoiceId":45820271,"patientAge":0,"npiId":0,"prescriptionId":1055447,"refillNo":10,"authExpiry":"-","exceedFlag":"N","appealFlag":"N","authNNType":"O"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":3476365,"procedureCode":"J3285","authNbr":"9262852501","serviceDate":"20160809","responseDate":"20161013","serviceBranchId":65,"serviceDuration":30,"placeOfService":12,"charges":16398.25,"daysOrUnits":100,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":13,"invoiceId":46027459,"patientAge":0,"npiId":0,"prescriptionId":1169479,"refillNo":2,"authExpiry":"20161231","exceedFlag":"N","appealFlag":"N","authNNType":"O"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":1064449,"procedureCode":"J3285","authNbr":"9327540001","serviceDate":"20160825","responseDate":"20161013","serviceBranchId":35,"serviceDuration":14,"placeOfService":12,"charges":6559.3,"daysOrUnits":40,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":14,"invoiceId":46303200,"patientAge":0,"npiId":0,"prescriptionId":714169,"refillNo":0,"authExpiry":"20170112","exceedFlag":"N","appealFlag":"N","authNNType":"O"},{"payorId":53723,"therapyType":"RMIV","ndcNumber":"66302010201","patientId":9205248,"procedureCode":"J3285","authNbr":"0000000","serviceDate":"20160823","responseDate":"20161013","serviceBranchId":65,"serviceDuration":20,"placeOfService":12,"charges":6559.3,"daysOrUnits":40,"algorithmType":"NN","label":0.0,"rejectOutcome":"Y","neighborCounter":15,"invoiceId":46257476,"patientAge":0,"npiId":0,"prescriptionId":1206606,"refillNo":0,"authExpiry":"-","exceedFlag":"N","appealFlag":"N","authNNType":"O"}]""")
//saveToHBase(conf.getString("hbase.tableName"), req_message, decisionTree_res)
message
}
答案 3 :(得分:0)
如果有人对该解决方案感兴趣。
对于消息(InputDStream),我使用foreachRDD,将其转换为RDD,然后使用map从consumerRecord获取json。接下来,将RDD转换为Array [String]并将其传递给processMessage方法。
messages.foreachRDD{ rdd =>
val newRDD = rdd.map{message =>
val req_message = message.value()
(message.value())
}
println("Request messages: " + newRDD.count())
var resultrows = newRDD.collect()//.collectAsList()
processMessage(resultrows, mlFeatures: MLFeatures)
}
在processMessage方法中,有一个for循环来处理所有字符串。我们还将请求消息和响应消息插入到hbase表中。
def processMessage(message: Array[String], mlFeatures: MLFeatures) = {
for (j <- 0 until message.size){
val req_message = message(j)//.get(j).toString()
val decisionTree_res = PriorAuthPredict.processPriorAuthPredict(req_message,mlFeatures)
println("Message processed is " + req_message)
kafkaProducer(conf.getString("kafka.responsetopic"), decisionTree_res)
var startTime = new Date().getTime();
saveToHBase(conf.getString("hbase.tableName"), req_message, decisionTree_res)
var endTime = new Date().getTime();
println("Kafka Consumer savetoHBase took : "+ (endTime - startTime) / 1000 + " seconds")
}
}