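I am trying to run JanusGraph with Cassandra as the storage backend and Elasticsearch as the index backend. Cassandra and Elasticsearch each run as a separate service in the same Kubernetes cluster as JanusGraph.
The required ports are open on both services, but the JanusGraph pod logs show a connection timeout when it tries to connect to Cassandra.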
23343 [main] WARN org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] configured at [conf/gremlin-server/janusgraph.properties] could not be instantiated and will not be available in Gremlin Server. GraphFactory message: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:82)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:70)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.lambda$new$0(DefaultGraphManager.java:57)
at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.<init>(DefaultGraphManager.java:55)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:110)
at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:89)
at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:110)
at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:354)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:78)
... 13 more
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager
at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477)
at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:409)
at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1376)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:113)
... 18 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
... 24 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:619)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.<init>(AstyanaxStoreManager.java:314)
... 29 more
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=cassandra(SERVICE_IP):9160, latency=10001(10001), attempts=1]Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:352)
at com.netflix.astyanax.thrift.ThriftClusterImpl.executeSchemaChangeOperation(ThriftClusterImpl.java:146)
at com.netflix.astyanax.thrift.ThriftClusterImpl.internalCreateKeyspace(ThriftClusterImpl.java:321)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addKeyspace(ThriftClusterImpl.java:294)
at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:614)
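I am running the janusgraph v2 image, and the gcr.io/google-samples/cassandra:v13 image for Cassandra.
I also tried connecting to Cassandra on port 9160 from a busybox pod, but that does not work either.
The interesting part is that ping works against the service name (cassandra here), but as soon as I telnet to port 9160 or 9042 I get a "connection refused" error.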
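A successful ping only shows that the service name resolves and that the pod answers ICMP. It says nothing about whether a process is listening on a given TCP port, and "connection refused" usually means the host was reached but nothing accepted the connection on that port. A minimal sketch of a probe that separates the three outcomes is shown here. It assumes bash (for /dev/tcp) and the coreutils timeout command are available in the pod you run it from; the function name probe and the 3-second limit are illustrative choices, not part of the original setup.

```shell
# probe HOST PORT: print "open", "timeout", or "closed" for a TCP port.
# Assumption: bash and coreutils `timeout` exist in the container.
probe() {
  local host=$1 port=$2
  # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connection.
  timeout 3 bash -c ": </dev/tcp/$host/$port" 2>/dev/null
  case $? in
    0)   echo "open" ;;     # TCP handshake completed
    124) echo "timeout" ;;  # no answer within 3 s (packets likely dropped)
    *)   echo "closed" ;;   # refused, or the name did not resolve
  esac
}
```

For example, `for p in 9042 9160; do echo "$p: $(probe cassandra "$p")"; done` would report both Cassandra ports through the headless service name.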
Here are the Cassandra Service and StatefulSet:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
    name: cql
  - port: 9160
    name: thrift
  selector:
    app: cassandra
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      #schedulerName: stork #Check benefits of using STORK as scheduler.
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        - containerPort: 9160
          name: thrift
        - containerPort: 9142
          name: transportssl
        resources:
          limits:
            cpu: "1Gi"
            memory: 2Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: CASSANDRA_SEEDS
          value: cassandra-0.cassandra.default.svc.cluster.local
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 512M
        - name: CASSANDRA_CLUSTER_NAME
          value: "Cassandra"
        - name: CASSANDRA_DC
          value: "DC1"
        - name: CASSANDRA_RACK
          value: "Rack1"
        - name: CASSANDRA_AUTO_BOOTSTRAP
          value: "false"
        - name: CASSANDRA_ENDPOINT_SNITCH
          value: GossipingPropertyFileSnitch
        - name: CASSANDRA_RPC_ADDRESS
          value: 0.0.0.0
        - name: CASSANDRA_NUM_TOKENS
          value: "32"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: nfs-pvc-cassandra
          mountPath: /srv/nfs/kubedata/janus
      restartPolicy: Always
      volumes:
      - name: nfs-pvc-cassandra
        persistentVolumeClaim:
          claimName: nfs-pvc-cassandra
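How can I debug this further?
Answer 0 (score: 0)
If JanusGraph is running on your host machine rather than inside the cluster, you may have to port-forward the Kubernetes service port to reach it locally. Perhaps you have already done that.
Answer 1 (score: 0)
As far as I can confirm, your StatefulSet YAML works, and the headless service creates a DNS name that points to the pod endpoints. I created a simple nginx pod and used telnet from it to check. The output is below.
Check that the service and endpoints exist for cassandra: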
$ kubectl get svc,ep cassandra
NAME                TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
service/cassandra   ClusterIP   None         <none>        9042/TCP,9160/TCP   2h

NAME                  ENDPOINTS                                                    AGE
endpoints/cassandra   10.56.0.10:9160,10.56.3.3:9160,10.56.4.2:9160 + 3 more...   2h
Exec into a neighbouring pod in the same namespace, then telnet to the service name and to a pod IP:
$ kubectl exec -it nginx-79dbd67896-9dwp8 bash
root@nginx-79dbd67896-9dwp8:/# telnet cassandra 9042
Trying 10.56.3.3...
Connected to cassandra.default.svc.cluster.local.
Escape character is '^]'.
telnet> quit
Connection closed.
root@nginx-79dbd67896-9dwp8:/# telnet 10.56.0.10 9042
Trying 10.56.0.10...
Connected to 10.56.0.10.
Escape character is '^]'.
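From the output, the service and pods are listening on port 9042 but not on 9160. Port 9160 belongs to Cassandra's Thrift API server, which is disabled by default. For more information on this, see https://github.com/docker-library/cassandra/issues/127. You will have to check how to enable the Thrift API port.
You can check which ports are listening inside the cassandra container by running the following commands: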
root@cassandra-0:/# apt update && apt install telnet net-tools
<output omitted>
root@cassandra-0:/# netstat -tulpen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name
tcp        0      0 0.0.0.0:9042            0.0.0.0:*               LISTEN      1000       10163244   -
tcp        0      0 127.0.0.1:43669         0.0.0.0:*               LISTEN      1000       10162974   -
tcp        0      0 10.56.0.10:7000         0.0.0.0:*               LISTEN      1000       10163145   -
tcp        0      0 127.0.0.1:7199          0.0.0.0:*               LISTEN      1000       10162973   -
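The netstat listing above can also be checked mechanically rather than by eye. The following is a rough sketch, assuming awk is available in the container and that the output has the column layout shown above (local address in column 4, state in column 6). The helper name is_listening is hypothetical.

```shell
# is_listening PORT: read `netstat -tln`-style output on stdin and
# succeed (exit 0) only if some socket in LISTEN state uses PORT.
is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" {
      n = split($4, parts, ":")        # local address, e.g. 0.0.0.0:9042
      if (parts[n] == port) found = 1  # last field after ":" is the port
    }
    END { exit !found }
  '
}
```

Inside the pod this could be used as `netstat -tln | is_listening 9160 || echo "nothing listens on 9160"`. With the listing above, 9042 would be reported as listening and 9160 would not.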
Hope this helps!
Answer 2 (score: 0)
Just update storage.hostname to cassandra-0.cassandra.default in the JanusGraph values.yaml so that the JanusGraph pod can talk to Cassandra.
Then check the Thrift status on the Cassandra node with nodetool: nodetool statusthrift
If Thrift is not enabled, enable it with nodetool enablethrift
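The two nodetool steps can be combined into one small helper. This is a sketch under a few assumptions: that nodetool statusthrift prints "running" when the Thrift server is up (which is how Cassandra releases that still ship Thrift, meaning those before 4.0, behave as far as I know), and that the NODETOOL variable, which is hypothetical and introduced only for this example, holds the command used to reach nodetool.

```shell
# ensure_thrift: enable the Thrift server only if it is not running yet.
# NODETOOL is a hypothetical override for how nodetool is invoked.
ensure_thrift() {
  local nt=${NODETOOL:-nodetool}
  # Assumption: `statusthrift` prints exactly "running" when Thrift is up.
  if [ "$($nt statusthrift)" = "running" ]; then
    echo "thrift already running"
  else
    $nt enablethrift && echo "thrift enabled"
  fi
}
```

From outside the pod it could be called as `NODETOOL="kubectl exec cassandra-0 -- nodetool" ensure_thrift`, once per Cassandra pod. Note that enablethrift only lasts until the node restarts. Making it permanent normally means setting start_rpc: true in cassandra.yaml, and how to do that depends on the image in use.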