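How to write a file to a Kafka producer

Time: 2015-10-22 04:47:32

Tags: file apache-kafka kafka-consumer-api kafka-producer-api

I am trying to load a simple text file into Kafka instead of typing on standard input. After downloading Kafka, I performed the following steps:

Started ZooKeeper:
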
bin/zookeeper-server-start.sh config/zookeeper.properties

Started the server:

bin/kafka-server-start.sh config/server.properties

Created a topic named "test":

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Ran the producer:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 
Test1
Test2

Read the messages with the consumer:

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Test1
Test2

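Instead of standard input, I want to pass a data file, or even a simple text file, to the producer so that the consumer can see its contents directly. I would really appreciate any help. Thanks!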

4 answers:

Answer 0 (score: 58)

You can pipe the file in by redirecting it to the producer's standard input (found here).

From 0.9.0:

kafka-console-producer.sh --broker-list localhost:9092 --topic my_topic --new-producer < my_file.txt
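
The redirection above can be wrapped in a small helper that checks the file is readable before starting the producer. This is only a sketch: the function name `publish_file` and the variables `KAFKA_PRODUCER` and `BROKER_LIST` are made up for this example, and it assumes `kafka-console-producer.sh` is on your `PATH` and accepts the `--broker-list` and `--topic` options used in this thread.

```shell
# Hypothetical defaults for this sketch; override them in the environment if needed.
KAFKA_PRODUCER="${KAFKA_PRODUCER:-kafka-console-producer.sh}"
BROKER_LIST="${BROKER_LIST:-localhost:9092}"

# publish_file FILE TOPIC
# Sends FILE to TOPIC through the console producer, which reads its
# standard input line by line and publishes each line as one message.
publish_file() {
    pf_file="$1"
    pf_topic="$2"
    if [ ! -r "$pf_file" ]; then
        echo "publish_file: cannot read $pf_file" >&2
        return 1
    fi
    # Redirecting the file replaces the keyboard as the producer's input.
    "$KAFKA_PRODUCER" --broker-list "$BROKER_LIST" --topic "$pf_topic" < "$pf_file"
}
```

For example, `publish_file my_file.txt my_topic` would publish every line of `my_file.txt` as a separate message.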

Answer 1 (score: 6)

This worked for me in Kafka 0.9.0.

Answer 2 (score: 3)

Here are a few approaches that are somewhat more general, although they may be overkill for a simple file.

tail

tail -n0 -F my_file.txt | kafka-console-producer.sh --broker-list localhost:9092 --topic my_topic

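Explanation

  1. tail reads from the end of the file as it grows, for example while log lines are continuously appended.
  2. -n0 outputs the last 0 lines, so only newly appended lines are sent.
  3. -F follows the file by name rather than by file descriptor, so it keeps working even if the file is rotated.

syslog-ng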

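options {
    flush_lines (0);
    time_reopen (10);
    log_fifo_size (1000);
    long_hostnames (off);
    use_dns (no);
    use_fqdn (no);
    create_dirs (no);
    keep_hostname (no);
};

# Read the file as-is, without parsing its lines as syslog messages.
source s_file {
    file("path to my-file.txt" flags(no-parse));
};

# "*.*.*.*" stands for the address of the host where nc is listening.
destination loghost {
    tcp("*.*.*.*" port(5140));
};

# A log path is needed to connect the source to the destination.
log {
    source(s_file);
    destination(loghost);
};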
    

Consume

nc -k -l 5140 | kafka-console-producer.sh --broker-list localhost:9092 --topic my_topic

Explanation (from man nc)

-k  Forces nc to stay listening for another connection after its current connection is completed. It is an error to use this option without the -l option.

-l  Used to specify that nc should listen for an incoming connection rather than initiate a connection to a remote host. It is an error to use this option in conjunction with the -p, -s, or -z options. Additionally, any timeouts specified with the -w option are ignored.

Reference

Syslog-ng

Answer 3 (score: 1)

echo "Hello" | kafka-console-producer.sh --broker-list localhost:9092 --topic my_topic
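
Because the console producer publishes each input line as a separate message, the same idea extends to several messages in one invocation. The sketch below is an illustration only: `send_messages`, `KAFKA_PRODUCER` and `BROKER_LIST` are hypothetical names introduced here, and it assumes `kafka-console-producer.sh` is on your `PATH`.

```shell
# Hypothetical defaults for this sketch; override them in the environment if needed.
KAFKA_PRODUCER="${KAFKA_PRODUCER:-kafka-console-producer.sh}"
BROKER_LIST="${BROKER_LIST:-localhost:9092}"

# send_messages TOPIC MESSAGE...
# Publishes every MESSAGE argument to TOPIC as its own message.
send_messages() {
    if [ "$#" -lt 2 ]; then
        echo "usage: send_messages TOPIC MESSAGE..." >&2
        return 1
    fi
    sm_topic="$1"
    shift
    # printf reuses the format for each remaining argument,
    # so every message ends up on its own line of the producer's input.
    printf '%s\n' "$@" | "$KAFKA_PRODUCER" --broker-list "$BROKER_LIST" --topic "$sm_topic"
}
```

For example, `send_messages my_topic "Hello" "second message"` would publish two messages. A message that itself contains a newline would be split into several messages, so this sketch only suits single-line payloads.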