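I want to write data into the Hive warehouse directory, into two separate tables configured as flumemaleemployee and flumefemaleemployee. The last 3 records should be inserted into the female table, and the 3 records above them should be inserted into the male table. Here is my data: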
1,alok,mumbai
1,jatin,chennai
1,yogesh,kolkata
2,ragini,delhi
2,jyotsana,pune
1,valmiki,banglore
Below is my Flume conf code:
agent.sources = tailsrc
agent.channels = mem1 mem2
agent.sinks = stdl std2
agent.sources.tailsrc.type = exec
agent.sources.tailsrc.command = tail -F /home/cloudera/Desktop/in.txt
agent.sources.tailsrc.batchSize = 1
agent.sources.tailsrc.interceptors = i1
agent.sources.tailsrc.interceptors.i1.type = regex_extractor
agent.sources.tailsrc.interceptors.il.regex = A(\\d}
agent.sources.tailsrc. interceptors. M.serializers = t1
agent.sources.tailsrc. interceptors, i1.serializers.t1. name = type
agent.sources.tailsrc.selector.type = multiplexing
agent.sources.tailsrc.selector.header = type
agent.sources.tailsrc.selector.mapping.1 = mem1
agent.sources.tailsrc.selector.mapping.2 = mem2
agent.sinks.std1.type = hdfs
agent.sinks.stdl.channel = mem1
agent.sinks.stdl.batchSize = 1
agent.sinks.std1.hdfs.path = /user/hive/warehouse/aisehibanayatp.db/flumemaleemployee
agent.sinks.stdl.rolllnterval = 0
agent.sinks.stdl.hdfs.fileType = DataStream
agent.sinks.std2.type = hdfs
agent.sinks.std2.channel = mem2
agent.sinks.std2.batchSize = 1
agent.sinks.std2.hdfs.path = /user/hi ve/warehouse/aisehibanayatp.db/flumefemaleemployee
agent.sinks.std2.rolllnterval = 0
agent.sinks.std2.hdfs.fileType = DataStream
agent.channels.mem1.type = memory
agent.channels.meml.capacity = 100
agent.channels.mem2.type = memory
agent.channels.mem2.capacity = 100
agent.sources.tailsrc.channels = mem1 mem2
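I am not getting any errors, but when I start the Flume service with the following command, it gets stuck at a point I don't know how to handle, because no error is reported: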
flume-ng agent --name agent -conf-file /home/cloudera/Desktop/flume1.config
and it stays stuck at the following step:
18/11/13 08:03:00 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: mem2. channel.event.take.success == 0
18/11/13 08:03:00 INFO node.Application: Starting new configuration:{ sourceRunners:{} sinkRunners:{std2=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@17ade71c counterGroup:{ name:null counters:{} } }} channels:{mem2=org.apache.flume.channel.MemoryChannel{name: mem2}} }
18/11/13 08:03:00 INFO node.Application: Starting Channel mem2
18/11/13 08:03:00 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: mem2 started
18/11/13 08:03:00 INFO node.Application: Starting Sink std2
18/11/13 08:03:00 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: std2: Successfully registered new MBean.
18/11/13 08:03:00 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: std2 started
So how can I achieve this?
Answer 0 (score: 0)
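The problem was typos, missing formatting, and stray spaces; for example, the letter l was used in place of the digit 1. I fixed those and the agent ran. I also modified your regex; you can tweak it further, but most of the issues were accuracy problems. Use the file below, substituting your own HDFS path and settings: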
agent.sources = tailsrc
agent.channels = mem1 mem2
agent.sinks = std1 std2
agent.sources.tailsrc.type = exec
agent.sources.tailsrc.command = tail -F /home/cloudera/in.txt
agent.sources.tailsrc.batchSize = 1
agent.sources.tailsrc.interceptors = i1
agent.sources.tailsrc.interceptors.i1.type = regex_extractor
agent.sources.tailsrc.interceptors.i1.regex = ^.*(1|2)
agent.sources.tailsrc.interceptors.i1.serializers = t1
agent.sources.tailsrc.interceptors.i1.serializers.t1.name = type
agent.sources.tailsrc.selector.type = multiplexing
agent.sources.tailsrc.selector.header = type
agent.sources.tailsrc.selector.mapping.1 = mem1
agent.sources.tailsrc.selector.mapping.2 = mem2
agent.sinks.std1.type = hdfs
agent.sinks.std1.channel = mem1
agent.sinks.std1.hdfs.batchSize = 1
agent.sinks.std1.hdfs.path = hdfs://quickstart.cloudera:8020/user/hive/warehouse/flumemaleemployee
agent.sinks.std1.hdfs.rollInterval = 0
agent.sinks.std1.hdfs.fileType = DataStream
agent.sinks.std2.type = hdfs
agent.sinks.std2.channel = mem2
agent.sinks.std2.hdfs.batchSize = 1
agent.sinks.std2.hdfs.path = hdfs://quickstart.cloudera:8020/user/hive/warehouse/flumefemaleemployee
agent.sinks.std2.hdfs.rollInterval = 0
agent.sinks.std2.hdfs.fileType = DataStream
agent.channels.mem1.type = memory
agent.channels.mem1.capacity = 100
agent.channels.mem2.type = memory
agent.channels.mem2.capacity = 100
agent.sources.tailsrc.channels = mem1 mem2
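To see how the regex_extractor interceptor and the multiplexing selector work together, here is a minimal Python sketch that imitates the routing outside of Flume. It is only an illustration, not Flume code: `extract_type` and `route` are hypothetical helper names, and dropping events that match no mapping is an assumption that reflects the file above, where no `selector.default` is set.

```python
import re

# Sample records from the question.
RECORDS = [
    "1,alok,mumbai",
    "1,jatin,chennai",
    "1,yogesh,kolkata",
    "2,ragini,delhi",
    "2,jyotsana,pune",
    "1,valmiki,banglore",
]

# Mirrors selector.mapping.1 = mem1 and selector.mapping.2 = mem2.
MAPPING = {"1": "mem1", "2": "mem2"}


def extract_type(line, pattern):
    """Imitate regex_extractor: return capture group 1 (the 'type' header)."""
    match = re.search(pattern, line)
    return match.group(1) if match else None


def route(lines, pattern):
    """Imitate the multiplexing selector: group lines by mapped channel."""
    channels = {"mem1": [], "mem2": []}
    for line in lines:
        channel = MAPPING.get(extract_type(line, pattern))
        if channel is not None:  # assumption: unmapped events are dropped
            channels[channel].append(line)
    return channels


if __name__ == "__main__":
    for name, lines in route(RECORDS, r"^.*(1|2)").items():
        print(name, lines)
```

With the sample data, the four records that start with 1 land in mem1 and the two that start with 2 land in mem2, so `1,valmiki,banglore` goes to the male table even though it is among the last three records. Note also that `^.*(1|2)` is greedy and captures the last 1 or 2 on the line, so a record such as `1,alok2,mumbai` would be tagged 2. If only the leading digit should decide, a pattern anchored on the first character, something like `^(\d)` (with the backslash doubled in a properties file), may be a safer tweak.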
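Because the agent in the question started without reporting an error, it can help to check the file for mismatched component names before launching it. The following is a rough Python sketch of such a check, not a Flume feature: `parse_properties` and `undeclared_components` are hypothetical helpers, the parser handles only simple `key = value` lines, and the sample text is a small excerpt of the question's file with its l-versus-1 typos left in.

```python
# Excerpt of the question's config, typos included (stdl vs std1, meml vs mem1).
SAMPLE = """
agent.channels = mem1 mem2
agent.sinks = stdl std2
agent.sinks.std1.type = hdfs
agent.sinks.stdl.channel = mem1
agent.sinks.std2.type = hdfs
agent.channels.mem1.type = memory
agent.channels.meml.capacity = 100
"""


def parse_properties(text):
    """Parse simple 'key = value' lines; skip blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def undeclared_components(props, agent="agent"):
    """Return (kind, name, key) for keys whose component name is not declared."""
    problems = []
    for kind in ("sources", "channels", "sinks"):
        declared = set(props.get(f"{agent}.{kind}", "").split())
        prefix = f"{agent}.{kind}."
        for key in props:
            if key.startswith(prefix):
                # The component name is the first segment after the prefix.
                name = key[len(prefix):].split(".", 1)[0]
                if name not in declared:
                    problems.append((kind, name, key))
    return problems


if __name__ == "__main__":
    for kind, name, key in undeclared_components(parse_properties(SAMPLE)):
        print(f"{kind}: '{name}' is used in '{key}' but never declared")
```

Run against the excerpt, the check flags `meml` under channels and `std1` under sinks, which are the kind of mix-ups described in the answer. It does not validate property names themselves, so a misspelled key such as `rolllnterval` would still pass.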