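Flume truncates characters when I use the logger sink type - it shows only the first 20 characters and ignores the rest

Asked: 2013-11-25 10:12:58

Tags: flume

Here is my test configuration (a netcat source with a logger sink acting as the console):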

# START OF CONFIG FILE

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 4444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

#==== END OF CONFIG FILE

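Now I run the following command to start the agent with this configuration:

$ bin/flume-ng agent --conf conf --conf-file conf/netcat_dump.conf --name a1 -Dflume.root.logger=DEBUG,console

Then I use the netcat command and enter the following text:

$ netcat localhost 4444

This is First Event being sent to Flume through netcat

Now, if you look at the Flume console, you will see a truncated log line.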

2013-11-25 15:33:20,862  ---- Event: { headers:{} body: 54 68 69 73 20 69 73 20 46 69 72 73 74 20 45 76 This is First Ev }
2013-11-25 15:33:20,862  ---- Events processed = 1

Note: I have tried most of the channel parameters, but none of them helped. Please help me solve this!

Thanks in advance.

2 Answers:

Answer 0 (score: 1)

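Your output is working as expected: the default logger sink truncates the event body to 16 bytes. I don't believe you can override this behavior without creating your own custom LoggerSink, because the current LoggerSink has no configuration parameters. I modified the existing LoggerSink below and named it AdvancedLoggerSink (a bit of a misnomer, since it isn't all that advanced).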
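As a quick sanity check of the 16-byte figure, the hex bytes from the log line in the question can be decoded by hand. The sketch below is purely illustrative: HexBodyCheck and parseHex are hypothetical names, not Flume classes, and the hex string is copied from the question's log output.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper, not part of Flume: decodes the space-separated hex
// bytes that the logger sink printed for the event body.
final class HexBodyCheck {

    // Turns "54 68 69 ..." into the corresponding raw bytes.
    static byte[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hex bytes copied from the log line in the question.
        byte[] body = parseHex("54 68 69 73 20 69 73 20 46 69 72 73 74 20 45 76");
        System.out.println(body.length + " bytes: "
                + new String(body, StandardCharsets.US_ASCII));
    }
}
```

The dump holds exactly 16 bytes, which matches the sink's default limit rather than any channel setting.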

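AdvancedLoggerSink adds a configuration parameter named maxBytes, which you can use to set how much of the log message is output. The default is still 16 bytes, but you can now override it with whatever value you want. If you set it to 0, the entire log message is printed.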
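The maxBytes rule can be sketched on its own, without any Flume classes. MaxBytesRule, normalize and visibleBytes are hypothetical names used only for illustration, the sample text is arbitrary, and the clamp to the body length is an assumption of this sketch rather than something the sink is known to do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Stand-alone sketch of the maxBytes rule; it does not use any Flume classes.
final class MaxBytesRule {

    static final int DEFAULT_MAX_BYTES = 16;

    // Negative settings fall back to the default of 16 bytes.
    static int normalize(int configured) {
        return configured < 0 ? DEFAULT_MAX_BYTES : configured;
    }

    // 0 means "whole body"; otherwise cap at maxBytes.
    // Clamping to body.length is an assumption made for this sketch.
    static byte[] visibleBytes(byte[] body, int maxBytes) {
        int limit = maxBytes == 0 ? body.length : Math.min(maxBytes, body.length);
        return Arrays.copyOf(body, limit);
    }

    public static void main(String[] args) {
        // Arbitrary sample text standing in for an event body.
        byte[] body = "This is First Event being sent to Flume through netcat"
                .getBytes(StandardCharsets.US_ASCII);
        for (int setting : new int[] {16, 0, -5}) {
            byte[] shown = visibleBytes(body, normalize(setting));
            System.out.println("maxBytes=" + setting + " -> "
                    + new String(shown, StandardCharsets.US_ASCII));
        }
    }
}
```

Running it shows the three cases side by side: a positive limit, 0 for the full body, and a negative value treated as the default.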

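To make this work, you need to download the Flume binaries and then build a JAR file containing the AdvancedLoggerSink class. When compiling and creating the jar, you need to put the following Flume jars on the classpath; they are located in the lib directory of the Flume binary download:

  • flume-ng-configuration-1.4.0.jar
  • flume-ng-core-1.4.0.jar
  • flume-ng-sdk-1.4.0.jar
  • slf4j-api-1.6.1.jar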

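Assuming you created a jar file named advancedLoggerSink.jar, place it in the Flume plugins directory, inside a directory named lib. The plugins directory defaults to $FLUME_HOME/plugins.d, but you can create it anywhere. Your directory structure should look like this: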

plugins.d/advanced-logger-sink/lib/advancedLoggerSink.jar

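(Make sure the jar is placed in a directory named 'lib'. For more information about the plugin directory layout, see the Flume User Guide: http://flume.apache.org/FlumeUserGuide.html)

To run the Flume agent, use the following command: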

flume-ng agent --plugins-path /path/to/your/plugins.d --conf /conf/directory --conf-file /conf/logger.flume --name a1 -Dflume.root.logger=INFO,console

Notice how I specified plugins-path (the path where the plugins.d directory lives). Flume automatically loads AdvancedLoggerSink from the plugins.d directory.

Here is the AdvancedLoggerSink class:

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Sink;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.EventHelper;
import org.apache.flume.sink.AbstractSink;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AdvancedLoggerSink extends AbstractSink implements Configurable {

    private static final int defaultMaxBytes = 16;

    private int maxBytesProp;

    private static final Logger logger = LoggerFactory
            .getLogger(AdvancedLoggerSink.class);

    @Override
    public void configure(Context context) {
        // maxBytes of 0 means to log the entire event;
        // negative values fall back to the default
        int configured = context.getInteger("maxBytes", defaultMaxBytes);
        this.maxBytesProp = configured < 0 ? defaultMaxBytes : configured;
    }

    @Override
    public Status process() throws EventDeliveryException {
        Status result = Status.READY;
        Channel channel = getChannel();
        Transaction transaction = channel.getTransaction();
        Event event = null;

        try {
            transaction.begin();
            event = channel.take();

            if (event != null) {
                if (logger.isInfoEnabled()) {
                    logger.info("Event: " + EventHelper.dumpEvent(
                                    event,
                                    this.maxBytesProp == 0 ? event.getBody().length : this.maxBytesProp
                                ));
                }
            } else {
                // No event found, request back-off semantics from the sink
                // runner
                result = Status.BACKOFF;
            }
            transaction.commit();
        } catch (Exception ex) {
            transaction.rollback();
            throw new EventDeliveryException("Failed to log event: " + event,
                    ex);
        } finally {
            transaction.close();
        }

        return result;
    }
}

Your configuration file should look like this:

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = AdvancedLoggerSink
# maxBytes is the maximum number of bytes to output for the body of the event
# the default is 16 bytes. If you set maxBytes to 0 then the entire record will
# be output.  
a1.sinks.k1.maxBytes = 0

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

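Answer 1 (score: 1)

The answer above by Sarus works, except that a1.sinks.k1.type must be changed to the fully qualified class name, including the package name. Also, for Flume 1.6.0, copy the compiled jar into the lib folder under the Flume installation path. You can also use System.out.println instead of the logger, something like this: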

if (event != null) {
    System.out.println(EventHelper.dumpEvent(event, event.getBody().length));
    status = Status.READY;
} else {
    System.out.println("Event is null");
    status = Status.BACKOFF;
}
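To illustrate the fully qualified class name: if the sink class were declared in a package, say a hypothetical com.example.flume (the class as posted in the first answer has no package declaration), the sink lines in the agent configuration would presumably look like this:

```properties
# Hypothetical package name; substitute the package your class actually declares.
a1.sinks.k1.type = com.example.flume.AdvancedLoggerSink
# 0 prints the whole event body, as described in the first answer.
a1.sinks.k1.maxBytes = 0
```

If the class has no package declaration, the bare class name used in the first answer is already its fully qualified name.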