Spark Streaming: retrieve messages from Kafka and write them to HDFS as binary

Date: 2019-06-17 10:43:50

Tags: apache-spark apache-kafka spark-streaming

When I try to retrieve the values and write them to HDFS with RDD.saveAsObjectFile(targetHDFSDir), I get a NotSerializableException. I want to read binary (byte arrays) from Kafka and write binary (byte arrays) to HDFS.

Key and Value deserializer: org.apache.kafka.common.serialization.ByteArrayDeserializer


    import org.apache.kafka.clients.CommonClientConfigs
    import org.apache.kafka.clients.consumer.ConsumerConfig

    val kafkaParams = Map[String, String](
      ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> bootStrapServer,
      ConsumerConfig.GROUP_ID_CONFIG -> "Spark1",
      ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> "org.apache.kafka.common.serialization.ByteArrayDeserializer",
      ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> "org.apache.kafka.common.serialization.ByteArrayDeserializer",
      ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "earliest",
      ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG -> "false",
      CommonClientConfigs.SECURITY_PROTOCOL_CONFIG -> "PLAINTEXT"
    )
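A likely cause of the NotSerializableException is calling saveAsObjectFile on an RDD of Kafka ConsumerRecord objects, which are not serializable. A minimal sketch of one way around this, assuming the spark-streaming-kafka-0-10 integration and placeholder names (`streamingContext`, `"topic"`, `targetHDFSDir`) not defined in the question:

    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    // ConsumerRecord is not serializable, which typically triggers the
    // NotSerializableException in saveAsObjectFile. Extract the raw
    // Array[Byte] payloads first; plain byte arrays serialize fine.
    val stream = KafkaUtils.createDirectStream[Array[Byte], Array[Byte]](
      streamingContext,
      PreferConsistent,
      Subscribe[Array[Byte], Array[Byte]](Seq("topic"), kafkaParams)
    )

    stream.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        // Keep only the binary values before writing. A per-batch
        // subdirectory avoids "output directory already exists" errors
        // when later micro-batches write to the same target.
        rdd.map(record => record.value)
           .saveAsObjectFile(s"$targetHDFSDir/batch-${System.currentTimeMillis()}")
      }
    }

Mapping to `record.value` (or to `(record.key, record.value)` if the keys are needed) before saving keeps only serializable data in the RDD; the batch-stamped output path is an assumption about how the author wants the files laid out, not something stated in the question.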

0 Answers:

No answers