Apache-Apex:如何将数据从kafka主题写入hdfs文件系统?

时间:2018-07-15 11:56:34

标签: java apache-kafka hdfs apache-apex

我试图从kafka主题中读取dat并将其写入HDFS文件系统,我使用[https://github.com/apache/apex-malhar/tree/master/examples/kafka]中的apex malhar来构建我的项目。 不幸的是,在设置kafka属性和hadoop配置之后,未在我的hdfs 2.6.0系统中创建数据。 PS:控制台没有显示任何错误,并且一切似乎都正常工作

此处是我用于我的应用的代码

public class TestConsumer {
    public static void main(String[] args) {
        Consumer consumerThread = new Consumer(KafkaProperties.TOPIC);
        consumerThread.start();
        ApplicationTest a = new ApplicationTest();
         try {
            a.testApplication();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

这里是Apal malhar的ApplicationTest类示例

package org.apache.apex.examples.kafka.kafka2hdfs;

import org.apache.log4j.Logger;
import javax.validation.ConstraintViolationException;

import org.junit.Rule;


import org.apache.apex.malhar.kafka.AbstractKafkaInputOperator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

import com.datatorrent.api.LocalMode;

import info.batey.kafka.unit.KafkaUnitRule;




/**
 * Test the DAG declaration in local mode.
 */

public class ApplicationTest
{
  private static final Logger LOG = Logger.getLogger(ApplicationTest.class);
  private static final String TOPIC = "kafka2hdfs";

  private static final int zkPort = NetUtils.getFreeSocketPort();
  private static final int brokerPort = NetUtils.getFreeSocketPort();
  private static final String BROKER = "localhost:" + brokerPort;
  private static final String FILE_NAME = "test";
  private static final String FILE_DIR = "./target/tmp/FromKafka";



  // broker port must match properties.xml
  @Rule
  private static  KafkaUnitRule kafkaUnitRule = new KafkaUnitRule(zkPort, brokerPort);

  public void testApplication() throws Exception
  {
    try {
      // run app asynchronously; terminate after results are checked
      LocalMode.Controller lc = asyncRun();


      lc.shutdown();
    } catch (ConstraintViolationException e) {
        LOG.error("constraint violations: " + e.getConstraintViolations());

    }
  }
  private Configuration getConfig()
  {
    Configuration conf = new Configuration(false);
    String pre = "dt.operator.kafkaIn.prop.";
    conf.setEnum(pre + "initialOffset", AbstractKafkaInputOperator.InitialOffset.EARLIEST);
    conf.setInt(pre + "initialPartitionCount", 1);
    conf.set(pre + "topics", TOPIC);
    conf.set(pre + "clusters", BROKER);

    pre = "dt.operator.fileOut.prop.";
    conf.set(pre + "filePath", FILE_DIR);
    conf.set(pre + "baseName", FILE_NAME);
    conf.setInt(pre + "maxLength", 40);
    conf.setInt(pre + "rotationWindows", 3);

    return conf;
  }

  private LocalMode.Controller asyncRun() throws Exception
  {
    Configuration conf = getConfig();
    LocalMode lma = LocalMode.newInstance();
    lma.prepareDAG(new KafkaApp(), conf);
    LocalMode.Controller lc = lma.getController();
    lc.runAsync();
    return lc;
  }
}

1 个答案:

答案 0 :(得分:0)

在runAsync之后和关闭之前,您需要等待预期的结果(否则DAG将立即退出)。实际上就是example中发生的事情。