My Kafka sink for Neo4j won't load

Date: 2017-08-14 18:13:31

Tags: java neo4j apache-kafka-connect

Introduction

First off, I apologize for any vagueness in this question. I'll try to provide as much information as I can (hopefully not too much), and please tell me if I should supply more. Also, I'm fairly new to Kafka and may stumble over terminology.

So, from my understanding of how sinks and sources work, I can use the FileStreamSourceConnector provided by the Kafka Quickstart guide to write data (Neo4j commands) to a topic held in a Kafka cluster. I can then write my own Neo4j sink and task that read those commands and send them to one or more Neo4j servers. To keep the project as simple as possible, for now I've based my sink and task on the Kafka Quickstart guide's FileStreamSinkConnector and FileStreamSinkTask (a sketch of the source properties I'm using follows the list below).

Kafka's FileStream classes:

FileStreamSourceConnector

FileStreamSourceTask

FileStreamSinkConnector

FileStreamSinkTask
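
For reference, a minimal sketch of the file source properties, following the quickstart defaults (the name matches the local-file-source connector that shows up in the logs below; the file name is a placeholder for whatever actually holds the Neo4j commands):

name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test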

My Neo4j sink connector:

package neo4k.sink;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;
import org.apache.kafka.common.utils.AppInfoParser;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.sink.SinkConnector;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Neo4jSinkConnector extends SinkConnector {

    public enum Keys {
        ;
        static final String URI = "uri";
        static final String USER = "user";
        static final String PASS = "pass";
        static final String LOG = "log";
    }

    private static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define(Keys.URI, Type.STRING, "", Importance.HIGH, "Neo4j URI")
            .define(Keys.USER, Type.STRING, "", Importance.MEDIUM, "User Auth")
            .define(Keys.PASS, Type.STRING, "", Importance.MEDIUM, "Pass Auth")
            .define(Keys.LOG, Type.STRING, "./neoj4sinkconnecterlog.txt", Importance.LOW, "Log File");

    private String uri;
    private String user;
    private String pass;
    private String logFile;

    @Override
    public String version() {
        return AppInfoParser.getVersion();
    }

    @Override
    public void start(Map<String, String> props) {
        uri = props.get(Keys.URI);
        user = props.get(Keys.USER);
        pass = props.get(Keys.PASS);
        logFile = props.get(Keys.LOG);
    }

    @Override
    public Class<? extends Task> taskClass() {
        return Neo4jSinkTask.class;
    }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        ArrayList<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            Map<String, String> config = new HashMap<>();
            if (uri != null)
                config.put(Keys.URI, uri);
            if (user != null)
                config.put(Keys.USER, user);
            if (pass != null)
                config.put(Keys.PASS, pass);
            if (logFile != null)
                config.put(Keys.LOG, logFile);
            configs.add(config);
        }
        return configs;
    }

    @Override
    public void stop() {
    }

    @Override
    public ConfigDef config() {
        return CONFIG_DEF;
    }
}

My Neo4j sink task:

package neo4k.sink;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;
import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.StatementResult;
import org.neo4j.driver.v1.exceptions.Neo4jException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.Collection;
import java.util.Map;

public class Neo4jSinkTask extends SinkTask {

    private static final Logger log = LoggerFactory.getLogger(Neo4jSinkTask.class);

    private String uri;
    private String user;
    private String pass;
    private String logFile;

    private Driver driver;
    private Session session;

    public Neo4jSinkTask() {
    }

    @Override
    public String version() {
        return new Neo4jSinkConnector().version();
    }

    @Override
    public void start(Map<String, String> props) {
        uri = props.get(Neo4jSinkConnector.Keys.URI);
        user = props.get(Neo4jSinkConnector.Keys.USER);
        pass = props.get(Neo4jSinkConnector.Keys.PASS);
        logFile = props.get(Neo4jSinkConnector.Keys.LOG);

        driver = null;
        session = null;

        try {
            driver = GraphDatabase.driver(uri, AuthTokens.basic(user, pass));
            session = driver.session();
        } catch (Neo4jException ex) {
            log.trace(ex.getMessage(), logFilename());
        }
    }

    @Override
    public void put(Collection<SinkRecord> sinkRecords) {
        StatementResult result;
        // Each record's value is run verbatim as a Cypher statement.
        for (SinkRecord record : sinkRecords) {
            result = session.run(record.value().toString());
            log.trace(result.toString(), logFilename());
        }
    }

    @Override
    public void flush(Map<TopicPartition, OffsetAndMetadata> offsets) {
    }

    @Override
    public void stop() {
        if (session != null)
            session.close();
        if (driver != null)
            driver.close();
    }

    private String logFilename() {
        return logFile == null ? "stdout" : logFile;
    }
}

The problem:

After writing that, I built it to a jar with all of its dependencies, excluding any Kafka dependencies (or is that an uber jar? It's a single file). I then edited the plugin path in connect-standalone.properties to include that artifact, and wrote a properties file for my Neo4j sink connector. I did this trying to follow these guidelines.
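
For context, the line I added to connect-standalone.properties looks roughly like this (the directory is a placeholder for wherever the built jar actually lives):

# comma-separated list of directories searched for connector plugins;
# the uber jar sits in this (placeholder) directory
plugin.path=/opt/connect-plugins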

My Neo4j sink connector properties file:

name=neo4k-sink
connector.class=neo4k.sink.Neo4jSinkConnector
tasks.max=1
uri=bolt://localhost:7687
user=neo4j
pass=Hunter2
topics=connect-test

But upon running the standalone worker, I get this error in the output and everything shuts down (the error is on line 5 of the log):

[2017-08-14 12:59:00,150] INFO Kafka version : 0.11.0.0 (org.apache.kafka.common.utils.AppInfoParser:83)
[2017-08-14 12:59:00,150] INFO Kafka commitId : cb8625948210849f (org.apache.kafka.common.utils.AppInfoParser:84)
[2017-08-14 12:59:00,153] INFO Source task WorkerSourceTask{id=local-file-source-0} finished initialization and start (org.apache.kafka.connect.runtime.WorkerSourceTask:143)
[2017-08-14 12:59:00,153] INFO Created connector local-file-source (org.apache.kafka.connect.cli.ConnectStandalone:91)
[2017-08-14 12:59:00,153] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:100)
java.lang.IllegalArgumentException: Malformed \uxxxx encoding.
    at java.util.Properties.loadConvert(Properties.java:574)
    at java.util.Properties.load0(Properties.java:390)
    at java.util.Properties.load(Properties.java:341)
    at org.apache.kafka.common.utils.Utils.loadProps(Utils.java:429)
    at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:84)
[2017-08-14 12:59:00,156] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:65)
[2017-08-14 12:59:00,156] INFO Stopping REST server (org.apache.kafka.connect.runtime.rest.RestServer:154)
[2017-08-14 12:59:00,168] INFO Stopped ServerConnector@540accf4{HTTP/1.1}{0.0.0.0:8083} (org.eclipse.jetty.server.ServerConnector:306)
[2017-08-14 12:59:00,173] INFO Stopped o.e.j.s.ServletContextHandler@6d548d27{/,null,UNAVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler:865)

Edit: I should mention that in the part of the output where the worker declares which plugins have been added, I do not see the jar that I built earlier and added a path for in connect-standalone.properties. Here's a snippet for context:

[2017-08-14 12:58:58,969] INFO Added plugin 'org.apache.kafka.connect.file.FileStreamSinkConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)
[2017-08-14 12:58:58,969] INFO Added plugin 'org.apache.kafka.connect.tools.MockSourceConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)
[2017-08-14 12:58:58,969] INFO Added plugin 'org.apache.kafka.connect.tools.VerifiableSourceConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)
[2017-08-14 12:58:58,969] INFO Added plugin 'org.apache.kafka.connect.tools.VerifiableSinkConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)
[2017-08-14 12:58:58,970] INFO Added plugin 'org.apache.kafka.connect.tools.MockConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:132)

Conclusion:

I'm at a loss. I've been testing and researching for a couple of hours now, and I'm not sure exactly what question to ask. So if you've made it this far, thank you for reading. If you noticed anything glaring that I did wrong in the code or approach (e.g., packaging the jar), or think I should provide more context or console logs, or anything else, please let me know. Thank you again.

1 Answer:

Answer 0 (score: 0):

As @Randall Hauch pointed out, my properties file had hidden characters in it because it was a rich-text document. I fixed this by copying the connect-file-sink.properties file that Kafka provides, which I believe is just a regular text document, then renaming and editing that copy for my Neo4j sink properties.
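
For anyone curious why this produces that exact stack trace: java.util.Properties treats a backslash followed by a lowercase u as the start of a \uxxxx escape, so a stray \u sequence anywhere in the file (here, likely from the hidden rich-text markup) makes load() throw before Connect ever sees the config. A minimal sketch reproducing it (the key and path are made up):

import java.io.StringReader;
import java.util.Properties;

public class MalformedEscapeDemo {
    public static void main(String[] args) throws Exception {
        // The literal below contains "C:\users"; Properties sees "\u"
        // followed by "sers", which is not four hex digits, so load()
        // throws IllegalArgumentException: Malformed \uxxxx encoding.
        new Properties().load(new StringReader("log=C:\\users\\me\\log.txt"));
    }
}

The practical fix, run from the Kafka distribution root (the sink properties file name here is my own):

cp config/connect-file-sink.properties config/neo4j-sink.properties
# edit the copy in a plain-text editor, then:
bin/connect-standalone.sh config/connect-standalone.properties config/neo4j-sink.properties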