I am testing the Flume HTTP Source with HDFS as the sink. Below is the conf file I am using.
httpagent.sources = httpsource
httpagent.sinks = hdfs-file-sink
httpagent.channels = ch3
httpagent.sources.httpsource.type = http
httpagent.sources.httpsource.bind = address
httpagent.sources.httpsource.handler = org.apache.flume.sink.solr.morphline.BlobHandler
httpagent.sources.httpsource.channels = ch3
httpagent.sources.httpsource.port = port
httpagent.sinks.hdfs-file-sink.type = hdfs
httpagent.sinks.hdfs-file-sink.hdfs.path = hdfs://localhost:8020/flume/events
httpagent.sinks.hdfs-file-sink.hdfs.fileType = DataStream
httpagent.sinks.hdfs-file-sink.hdfs.filePrefix = events-
httpagent.sinks.hdfs-file-sink.hdfs.rollInterval = 30
httpagent.sinks.hdfs-file-sink.channel = ch3
httpagent.channels.ch3.type = memory
The requests are being saved in HDFS, but I want to append the HTTP headers to the POST content. How can I do that?
Answer 0 (score: 0)
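You may want to look at event serializers (https://flume.apache.org/FlumeUserGuide.html#event-serializers).
A serializer can be set on the sink as described in https://flume.apache.org/FlumeUserGuide.html#hdfs-sink.
By default the HDFS sink writes only the event body, so if you need the headers you should use the Avro Event Serializer or write a custom serializer.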
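As a minimal sketch, assuming the agent and sink names from the question, the built-in Avro serializer could be enabled with one extra line in the sink configuration. The `avro_event` alias is the one listed in the Flume User Guide. The resulting files would be Avro container files rather than plain text, and they would carry whatever headers the HTTP handler attached to each event, which depends on the handler in use.

```properties
# Assumption: added alongside the existing hdfs-file-sink settings from the question.
# avro_event serializes each event as an Avro record containing both headers and body.
httpagent.sinks.hdfs-file-sink.serializer = avro_event
```

If plain-text output with the headers prepended to the body is required instead, the `serializer` property would point at the fully qualified class name of a custom `EventSerializer.Builder` implementation.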