Spark Kafka Avro producer in Structured Streaming

Time: 2018-04-24 20:45:37

Tags: apache-kafka spark-dataframe spark-streaming avro

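I have a working UDF that sends Kafka messages in Avro, and I know this is not what UDFs are meant for. I couldn't find a good way to achieve this, and it does work... but I'd like to know whether it is a really bad idea. Does anyone have a better approach?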

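    // Imports the snippet relies on (assumes a SparkSession named `spark` is in scope)
    import java.util.Properties
    import io.confluent.kafka.schemaregistry.client.rest.RestService
    import io.confluent.kafka.serializers.KafkaAvroSerializer
    import org.apache.avro.Schema
    import org.apache.avro.generic.{GenericData, GenericRecord}
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
    import spark.implicits._

    // Use this schema if you don't have a schema registry
    val testSchema = "{\"type\":\"record\",\"name\":\"myrecord\",\"fields\":[{\"name\":\"f1\",\"type\":\"string\"}]}"
    val df = Seq("1").toDF("f1")

    val topic = "mytest"
    val brokers = "kafka01:9092"
    val schemaRegistryURL = "http://sr:8081"
    val subjectValueName = topic + "-value"

    // Called once per row: builds a producer, fetches the latest schema, sends one Avro record
    val KafkaAvroProducerFunct: (String => String) = (value: String) => {
      val props = new Properties()
      props.put("bootstrap.servers", brokers)
      props.put("key.serializer", classOf[KafkaAvroSerializer].getCanonicalName)
      props.put("value.serializer", classOf[KafkaAvroSerializer].getCanonicalName)
      props.put("schema.registry.url", schemaRegistryURL)
      val producer = new KafkaProducer[GenericRecord, GenericRecord](props)
      // Look up the latest registered value schema for the topic
      val avro_schema = new RestService(schemaRegistryURL).getLatestVersion(subjectValueName)
      val messageSchema = new Schema.Parser().parse(avro_schema.getSchema)
      val avroRecord = new GenericData.Record(messageSchema)
      avroRecord.put("f1", value)
      // No key is set, so the record is sent with a null key
      val record = new ProducerRecord[GenericRecord, GenericRecord](topic, avroRecord)
      producer.send(record)
      // send() is asynchronous; close() flushes the record and releases the connection
      producer.close()
      "sent"
    }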
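
For comparison, one alternative to a side-effecting UDF is Spark's `ForeachWriter`, which is opened once per partition and epoch, so a single producer can be reused for all rows in that partition instead of being created per row. The following is only a minimal sketch under assumptions: the class name `AvroKafkaWriter` is hypothetical, the schema is passed in as a JSON string (for example `testSchema` above) rather than fetched from the registry on every call, and the key serializer is swapped to `StringSerializer` because no key is sent.

```scala
import java.util.Properties

import io.confluent.kafka.serializers.KafkaAvroSerializer
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer
import org.apache.spark.sql.ForeachWriter

// Hypothetical helper: writes each String row as an Avro record with a single field "f1".
class AvroKafkaWriter(
    topic: String,
    brokers: String,
    schemaRegistryURL: String,
    schemaJson: String
) extends ForeachWriter[String] {

  // Neither the producer nor the parsed schema is shipped with the task;
  // both are created on the executor in open().
  @transient private var producer: KafkaProducer[String, GenericRecord] = _
  @transient private var schema: Schema = _

  // Called once per partition and epoch; returning true means "process this partition".
  override def open(partitionId: Long, epochId: Long): Boolean = {
    val props = new Properties()
    props.put("bootstrap.servers", brokers)
    props.put("key.serializer", classOf[StringSerializer].getCanonicalName)
    props.put("value.serializer", classOf[KafkaAvroSerializer].getCanonicalName)
    props.put("schema.registry.url", schemaRegistryURL)
    producer = new KafkaProducer[String, GenericRecord](props)
    schema = new Schema.Parser().parse(schemaJson)
    true
  }

  // Called once per row; reuses the producer created in open().
  override def process(value: String): Unit = {
    val avroRecord = new GenericData.Record(schema)
    avroRecord.put("f1", value)
    producer.send(new ProducerRecord[String, GenericRecord](topic, avroRecord))
  }

  // Called when the partition is done (or failed); close() flushes pending records.
  override def close(errorOrNull: Throwable): Unit = {
    if (producer != null) producer.close()
  }
}
```

With a streaming `Dataset[String]` (here a hypothetical `stream`), this would be attached as `stream.writeStream.foreach(new AvroKafkaWriter(topic, brokers, schemaRegistryURL, testSchema)).start()`. Whether this is actually preferable to the UDF depends on delivery guarantees and throughput needs that the question does not state.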

0 answers:

No answers yet