I am trying to set up a basic Java consumer to receive messages from a Kafka topic. I followed the example at https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example and ended up with this code:
package org.example.kafka.client;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class KafkaClientMain
{
    private final ConsumerConnector consumer;
    private final String topic;
    private ExecutorService executor;

    public KafkaClientMain(String a_zookeeper, String a_groupId, String a_topic)
    {
        this.consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
                createConsumerConfig(a_zookeeper, a_groupId));
        this.topic = a_topic;
    }

    private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", a_zookeeper);
        props.put("group.id", a_groupId);
        props.put("zookeeper.session.timeout.ms", "1000");
        props.put("zookeeper.sync.time.ms", "1000");
        props.put("auto.commit.interval.ms", "1000");
        props.put("auto.offset.reset", "smallest");
        return new ConsumerConfig(props);
    }

    public void shutdown() {
        if (consumer != null) consumer.shutdown();
        if (executor != null) executor.shutdown();
    }

    public void run(int a_numThreads) {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, new Integer(a_numThreads));
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
        List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
        System.out.println( "streams.size = " + streams.size() );

        // now launch all the threads
        //
        executor = Executors.newFixedThreadPool(a_numThreads);

        // now create an object to consume the messages
        //
        int threadNumber = 0;
        for (final KafkaStream stream : streams) {
            executor.submit(new ConsumerTest(stream, threadNumber));
            threadNumber++;
        }
    }

    public static void main(String[] args)
    {
        String zooKeeper = "ec2-whatever.compute-1.amazonaws.com:2181";
        String groupId = "group1";
        String topic = "test";
        int threads = 1;
        KafkaClientMain example = new KafkaClientMain(zooKeeper, groupId, topic);
        example.run(threads);
    }
}
and
package org.example.kafka.client;

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;

public class ConsumerTest implements Runnable
{
    private KafkaStream m_stream;
    private int m_threadNumber;

    public ConsumerTest(KafkaStream a_stream, int a_threadNumber)
    {
        m_threadNumber = a_threadNumber;
        m_stream = a_stream;
    }

    public void run()
    {
        System.out.println( "calling ConsumerTest.run()" );
        ConsumerIterator<byte[], byte[]> it = m_stream.iterator();
        while (it.hasNext())
        {
            System.out.println("Thread " + m_threadNumber + ": " + new String(it.next().message()));
        }
        System.out.println("Shutting down Thread: " + m_threadNumber);
    }
}
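Kafka is running on the EC2 host in question, and I can send and receive messages on the "test" topic using the kafka-console-producer.sh and kafka-console-consumer.sh tools. Port 2181 is open and reachable from the machine where the consumer runs (for good measure, so is 9092, but that doesn't seem to help either).
Unfortunately, I never receive any messages in my consumer when I run this: neither messages already on the topic, nor new messages sent with kafka-console-producer.sh while the consumer is running.
This is with Kafka 0.8.1.1 running on CentOS 6.4 x64, using OpenJDK 1.7.0_65.
Edit: FWIW, when the consumer program starts, I see this Zookeeper output: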
[2014-08-01 15:56:38,045] INFO Accepted socket connection from /98.101.159.194:24218 (org.apache.zookeeper.server.NIOServerCnxn)
[2014-08-01 15:56:38,049] INFO Client attempting to establish new session at /98.101.159.194:24218 (org.apache.zookeeper.server.NIOServerCnxn)
[2014-08-01 15:56:38,053] INFO Established session 0x1478e963fb30008 with negotiated timeout 6000 for client /98.101.159.194:24218 (org.apache.zookeeper.server.NIOServerCnxn)
Any idea what might be going on? Any and all help is much appreciated.
Answer 0 (score: 11)
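Answering this myself for posterity, in case anyone else runs into a similar problem.
The problem was this: the Kafka broker and Zookeeper were on an EC2 node, and the consumer was running on my laptop. When connecting to Zookeeper, the client was given a reference to "ip-10-0-x-x.ec2.internal", which does not resolve from outside EC2 (by default). This became clear once I configured log4j properly on the client, so that I got all the log messages.
The workaround was to add an entry to my /etc/hosts file mapping the EC2 internal hostname to the publicly routable IP address.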
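For anyone who wants to confirm this kind of failure before editing /etc/hosts, here is a minimal sketch of a resolution check to run on the client machine. The class name HostnameCheck and the resolves helper are hypothetical names made up for illustration; they are not part of Kafka. It only asks the local resolver (which also consults /etc/hosts) whether a name maps to an address, so it says nothing about whether the broker port is actually reachable.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of Kafka.
public class HostnameCheck {

    // Returns true if the local resolver (including /etc/hosts) can map
    // the given name to an address, false if the lookup fails.
    static boolean resolves(String hostname) {
        try {
            InetAddress.getByName(hostname);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the hostname that shows up in the client logs as the first
        // argument; "localhost" is only a placeholder default.
        String hostname = args.length > 0 ? args[0] : "localhost";
        System.out.println(hostname
                + (resolves(hostname) ? " resolves" : " does NOT resolve"));
    }
}
```

Run it with the internal hostname from the client logs as the argument. If it reports that the name does not resolve, the consumer on that machine will presumably be unable to reach the broker under that name either, and after adding the /etc/hosts entry the same check should succeed.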
Answer 1 (score: 3)
You can solve this by setting the following property in the server.properties file located under the Kafka config folder:
advertised.host.name=<public DNS of the EC2 server>