I get the above error when I try to consume logs from a Kafka topic, process them, and push them to Solr.
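The problem appeared when I added the Solr publishing part. Without it, I can consume the Kafka stream and print it to HDFS or the console without issue. For that part I followed this simple example, https://github.com/mganta/streaming-data/blob/master/src/main/java/com/example/streaming/CarEventsProcessor.java, because I could not make sense of the documentation for Lucidworks' spark-solr library.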
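import com.lucidworks.spark.util.SolrSupport
import org.apache.solr.common.SolrInputDocument
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val topics = //my topics
val kafkaParams = Map[String, Object](...)
// ssc is the StreamingContext that is started at the bottom
val stream =
  KafkaUtils.createDirectStream[String, String](
    ssc,
    PreferConsistent,
    Subscribe[String, String](topics, kafkaParams)
  )
val processed = //process stream

def convert(field_to_process: ...): SolrInputDocument = {
  //create document to push to solr
  //test with basic document
  val doc = SolrSupport.autoMapToSolrInputDoc("", null, Map())
  doc.addField("left", "right")
  doc
}

// first argument: ZooKeeper connect string of the SolrCloud cluster (named brokers here),
// then the collection name, the batch size, and the DStream of documents
SolrSupport.indexDStreamOfDocs(brokers, "table", 1, processed.map(convert))
ssc.start()
ssc.awaitTermination()
ssc.stop()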
I suspect this is a dependency error. Here is my pom.xml:
<spark.version>2.2.1</spark.version>
<scala.version>2.11.8</scala.version>
<scala.compat.version>2.11</scala.compat.version>
<spark.solr.version>3.4.5</spark.solr.version>
<solr.version>7.3.0</solr.version>
<fasterxml.version>2.9.9</fasterxml.version>
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>${scala.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.compat.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_${scala.compat.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kafka-0-10_${scala.compat.version}</artifactId>
  <version>2.0.0</version>
</dependency>
<dependency>
  <groupId>com.lucidworks.spark</groupId>
  <artifactId>spark-solr</artifactId>
  <version>${spark.solr.version}</version>
</dependency>
<!-- slf4j libraries -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
  <version>1.7.5</version>
</dependency>
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.7.25</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-core</artifactId>
  <version>${fasterxml.version}</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-databind</artifactId>
  <version>${fasterxml.version}</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-annotations</artifactId>
  <version>${fasterxml.version}</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.module</groupId>
  <artifactId>jackson-module-scala_${scala.compat.version}</artifactId>
  <version>${fasterxml.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.httpcomponents</groupId>
  <artifactId>httpclient</artifactId>
  <version>4.5</version>
</dependency>
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-jaxrs</artifactId>
  <version>1.9.8</version>
</dependency>
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-core-asl</artifactId>
  <version>1.9.8</version>
</dependency>
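
Since a dependency conflict is suspected, one mismatch in the list above that may be worth ruling out is the Kafka connector: it is pinned to 2.0.0 while the other Spark modules use ${spark.version} (2.2.1). The post does not confirm that this is related to the error, so the following is only a sketch of what aligning the two would look like, assuming the connector should track the Spark version:

```xml
<!-- Sketch only: same artifact as above, with the version tied to spark.version -->
<!-- Assumption: the 2.0.0 pin was not deliberate -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kafka-0-10_${scala.compat.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
```

Running mvn dependency:tree afterwards shows which versions of each library Maven actually resolves, which can help narrow down a conflict between transitive dependencies.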