JanusGraph: Spark cluster fails to connect to Cassandra

Asked: 2018-04-12 06:54:11

Tags: apache-spark cassandra gremlin janusgraph

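I am trying to run a Spark job on a cluster that creates a JanusGraph.

I have a single instance each of JanusGraph Server, Cassandra, and ES running on one machine; only the Spark computation runs on the cluster. (Basically, I ran janusgraph.sh start on that machine.)

My configuration is as follows (x is the IP of the machine where I run the instances above):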

def getGraph(): JanusGraph = {
    val config = JanusGraphFactory.build()
    config.set("storage.backend", "cassandrathrift")
    config.set("storage.cassandrathrift.keyspace", "jgex")
    config.set("storage.hostname", "x")
    config.set("index.jgex.backend", "elasticsearch")
    config.set("index.jgex.index-name", "jgex")
    config.set("jgex.hostname", "x")
    config.open()
  }

But when I spark-submit the fat jar on the cluster, I get:

    java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftStoreManager
    at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
    at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477)
    at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:409)
    at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1376)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:123)
    at org.janusgraph.core.JanusGraphFactory$Builder.open(JanusGraphFactory.java:264)
    at janus_create$.getGraph(janus_create.scala:66)
    at janus_create$.makePropertiesandIndexes(janus_create.scala:830)
    at janus_create$.main(janus_create.scala:921)
    at janus_create.main(janus_create.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
    ... 16 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftStoreManager.getCassandraPartitioner(CassandraThriftStoreManager.java:219)
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftStoreManager.<init>(CassandraThriftStoreManager.java:198)
    ... 21 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeRawConnection(CTConnectionFactory.java:110)
    at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:74)
    at org.janusgraph.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:43)
    at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1179)
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftStoreManager.getCassandraPartitioner(CassandraThriftStoreManager.java:216)
    ... 22 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
    ... 28 more

I tried switching between cassandra and cassandrathrift, but neither works. Also, where do I specify where my Gremlin Server is running? Does that matter?

1 Answer:

Answer 0 (score: 2)

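The pre-packaged distribution assumes a single localhost node for each of Cassandra, Elasticsearch, and Gremlin Server. Notice the java.net.ConnectException: Connection refused at the bottom of the stack trace. If you run Spark on a remote cluster, you need to make sure the servers are available at a non-localhost address.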
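To confirm that the refusal happens at the network level, a small probe can be run from one of the Spark nodes. This is only a sketch: PortProbe and isReachable are hypothetical names, "x" is the placeholder IP from the question, and the ports are the stock defaults (9160 Cassandra Thrift, 9042 Cassandra CQL, 9200 Elasticsearch HTTP, 8182 Gremlin Server), which assumes they were not changed in your installation.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper, not part of JanusGraph or Spark.
object PortProbe {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def isReachable(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Covers connection refused, timeouts, and unresolvable host names.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // "x" stands for the IP of the machine running the JanusGraph distribution.
    val host = args.headOption.getOrElse("x")
    // Default ports; adjust if your configuration differs.
    val services = Seq(
      "Cassandra Thrift"   -> 9160,
      "Cassandra CQL"      -> 9042,
      "Elasticsearch HTTP" -> 9200,
      "Gremlin Server"     -> 8182
    )
    for ((name, port) <- services) {
      val status = if (isReachable(host, port)) "reachable" else "refused or timed out"
      println(s"$name ($host:$port): $status")
    }
  }
}
```

If the probe reports a port as reachable from the machine itself but not from a cluster node, the service is most likely still bound to localhost only.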

  • First, stop the distribution with bin/janusgraph.sh stop
  • Update listen_address and rpc_address in $JANUSGRAPH_HOME/conf/cassandra/cassandra.yaml with the machine's IP address (Cassandra docs)
  • Add network.host, set to the machine's IP address, to $JANUSGRAPH_HOME/elasticsearch/config/elasticsearch.yml (Elasticsearch docs)
  • Update host in $JANUSGRAPH_HOME/conf/gremlin-server/gremlin-server.yaml with the machine's IP address (JanusGraph docs)
  • Assuming you are using the default Gremlin Server configuration, gremlin-server.yaml, you also need to update its properties file, $JANUSGRAPH_HOME/conf/gremlin-server/janusgraph-cassandra-es-server.properties. Set storage.hostname and index.search.hostname to the machine's IP address, matching the server settings above.
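As a rough illustration, the edited settings in the three YAML files might look like the fragment below. The address 192.0.2.10 is a made-up placeholder for the machine's IP; only the keys named in the steps are shown, and the rest of each file is assumed to stay as shipped.

```yaml
# $JANUSGRAPH_HOME/conf/cassandra/cassandra.yaml
listen_address: 192.0.2.10   # placeholder IP, replace with your machine's address
rpc_address: 192.0.2.10

# $JANUSGRAPH_HOME/elasticsearch/config/elasticsearch.yml
network.host: 192.0.2.10

# $JANUSGRAPH_HOME/conf/gremlin-server/gremlin-server.yaml
host: 192.0.2.10
```

In janusgraph-cassandra-es-server.properties, the corresponding lines would then be storage.hostname=192.0.2.10 and index.search.hostname=192.0.2.10, again with the placeholder swapped for the real address.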

Update: There appear to be mistakes in your graph connection properties that could be causing this problem:

  • config.set("storage.cassandrathrift.keyspace", "jgex") - should be "storage.cassandra.keyspace"
  • config.set("jgex.hostname", "x") - should be "index.jgex.hostname"
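Putting the two corrections together, the builder from the question might look like the sketch below. It assumes the JanusGraph core, Cassandra, and Elasticsearch modules are on the classpath; the GraphConfig wrapper object is a hypothetical name added for illustration, and "x" is still the placeholder IP from the question.

```scala
import org.janusgraph.core.{JanusGraph, JanusGraphFactory}

// Hypothetical wrapper object; getGraph mirrors the method in the question.
object GraphConfig {
  def getGraph(): JanusGraph =
    JanusGraphFactory.build()
      .set("storage.backend", "cassandrathrift")
      .set("storage.hostname", "x")              // placeholder: IP of the Cassandra machine
      .set("storage.cassandra.keyspace", "jgex") // was storage.cassandrathrift.keyspace
      .set("index.jgex.backend", "elasticsearch")
      .set("index.jgex.index-name", "jgex")
      .set("index.jgex.hostname", "x")           // was jgex.hostname
      .open()
}
```

These key fixes do not by themselves make Cassandra reachable; the services still have to listen on a non-localhost address.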