I have written a Kafka producer in Scala in IntelliJ, passing two args as files. I used the following code:
package kafkaProducer

import java.util.Properties
import org.apache.kafka.clients.producer._
import org.apache.spark._
import scala.io.Source

object kafkaProducerScala extends App {
  val conf = new SparkConf()
    .setMaster(args(0))
    .setAppName("kafkaProducerScala")
  val sc = new SparkContext(conf)
  sc.setLogLevel("ERROR")

  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  val topic = "KafkaTopics"

  // args(2) holds the threshold value(s); args(1) holds the data values
  for (line2 <- Source.fromFile(args(2)).getLines) {
    val c = line2.toInt
    for (line <- Source.fromFile(args(1)).getLines) {
      val a = line.toInt
      if (a > c) {
        println(a)
        val record = new ProducerRecord[String, String](topic, a.toString)
        producer.send(record)
      }
    }
  }
  producer.close() // close once, after all records have been sent
}
Here is the build.sbt file:
name := "KafkaProducer"
version := "0.1"
scalaVersion := "2.12.7"
libraryDependencies += "org.apache.kafka" %% "kafka" % "2.0.1"
resolvers += Resolver.mavenLocal
// https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "2.0.1"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0"
My goal is to get the output in a Kafka consumer, and that part works fine. I then created a .jar file for spark-submit.
I used the following spark-submit command:
C:\spark-2.3.1-bin-hadoop2.7\bin>spark-submit --class kafkaProducer.kafkaProducerScala C:\Users\Shaheel\IdeaProjects\KafkaProducer\target\scala-2.12\kafkaproducer_2.12-0.1.jar local C:\Users\Shaheel\Desktop\demo.txt C:\Users\Shaheel\Desktop\condition.properties
But I get the following error:
2018-11-28 17:53:58 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/kafka/clients/producer/KafkaProducer
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
at java.lang.Class.privateGetMethodRecursive(Unknown Source)
at java.lang.Class.getMethod0(Unknown Source)
at java.lang.Class.getMethod(Unknown Source)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:42)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.clients.producer.KafkaProducer
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 11 more
2018-11-28 17:53:58 INFO ShutdownHookManager:54 - Shutdown hook called
2018-11-28 17:53:58 INFO ShutdownHookManager:54 - Deleting directory C:\Users\Shaheel\AppData\Local\Temp\spark-96060579-36cc-4c68-b85e-429acad4fd38
Please help me resolve it.
Answer 0 (score: 1)
The Scala version you are using is 2.12.7, while Spark is still built against Scala 2.11:
Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). It's easy to run locally on one machine; all you need is to have Java installed on your system PATH, or the JAVA_HOME environment variable pointing to a Java installation.

Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.0 uses Scala 2.11. You will need to use a compatible Scala version (2.11.x).

Note that support for Java 7, Python 2.6 and old Hadoop versions before 2.6.5 was removed as of Spark 2.2.0. Support for Scala 2.10 was removed as of 2.3.0.

The excerpt above is taken directly from the documentation page of Apache Spark (v2.4.0).
Change your Scala version to 2.11.12 and add the sbt-assembly plugin to your plugins.sbt file. Then all you need to do is run the command sbt assembly in the root directory of your project (where src and build.sbt sit together), and the jar that gets created will bundle the kafka-clients dependency.
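A minimal project/plugins.sbt sketch (the plugin version here is an assumption; any recent sbt-assembly release should do):

// project/plugins.sbt: pulls in the sbt-assembly plugin that builds the fat jar
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")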
The corrected build.sbt is as follows:
val sparkVersion = "2.4.0"

name := "KafkaProducer"
version := "0.1"
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.kafka" % "kafka-clients" % "2.0.1",
  "org.apache.spark" %% "spark-core" % sparkVersion % Provided
)
Apache Spark dependencies are always used in the provided scope, because Spark supplies them to your code at runtime.
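For reference, a rough sketch of the build-and-submit sequence. It assumes sbt-assembly's default jar naming of <name>-assembly-<version>.jar and reuses the paths from the question; adjust both to your setup:

cd C:\Users\Shaheel\IdeaProjects\KafkaProducer
sbt assembly
spark-submit --class kafkaProducer.kafkaProducerScala C:\Users\Shaheel\IdeaProjects\KafkaProducer\target\scala-2.11\KafkaProducer-assembly-0.1.jar local C:\Users\Shaheel\Desktop\demo.txt C:\Users\Shaheel\Desktop\condition.properties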
Answer 1 (score: 0)
The kafka jar is not on Spark's classpath. You either have to pass it along with your submit using --jars, or package it into your own jar (a fat jar).
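If you take the --jars route instead, a sketch of the submit command (the kafka-clients jar path below points at sbt's local Ivy cache and is an assumption; point it at wherever the jar actually lives on your machine):

spark-submit --class kafkaProducer.kafkaProducerScala --jars C:\Users\Shaheel\.ivy2\cache\org.apache.kafka\kafka-clients\jars\kafka-clients-2.0.1.jar C:\Users\Shaheel\IdeaProjects\KafkaProducer\target\scala-2.11\kafkaproducer_2.11-0.1.jar local C:\Users\Shaheel\Desktop\demo.txt C:\Users\Shaheel\Desktop\condition.properties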