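Apache Ignite nodes deployed as pods discover each other using TcpDiscoveryKubernetesIpFinder but cannot communicate, so they never join the same cluster.
I set up a Kubernetes deployment on Azure for an Ignite-based application using the "official" tutorial. The deployment itself succeeds, but each pod's topology always contains only one server. When I log in to a pod directly and try to connect to another pod on port 47500, it does not work. More interestingly, port 47500 is reachable only on 127.0.0.1 inside the current container, not through the pod's external IP.
Below are the debug messages from pod/node 1. As you can see, TcpDiscoveryKubernetesIpFinder discovers both Ignite pods/nodes, but it cannot connect to the other Ignite node: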
INFO [org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi] (ServerService Thread Pool -- 5) Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
DEBUG [org.apache.ignite.internal.managers.communication.GridIoManager] (ServerService Thread Pool -- 5) Starting SPI: TcpCommunicationSpi [connectGate=null, connPlc=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$FirstConnectionPolicy@48ca2359, enableForcibleNodeKill=false, enableTroubleshootingLog=false, locAddr=null, locHost=0.0.0.0/0.0.0.0, locPort=47100, locPortRange=100, shmemPort=-1, directBuf=true, directSndBuf=false, idleConnTimeout=600000, connTimeout=5000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=0, slowClientQueueLimit=0, nioSrvr=GridNioServer [selectorSpins=0, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@30a29315, directMode=true], GridConnectionBytesVerifyFilter], closed=false, directBuf=true, tcpNoDelay=true, sockSndBuf=32768, sockRcvBuf=32768, writeTimeout=2000, idleTimeout=600000, skipWrite=false, skipRead=false, locAddr=0.0.0.0/0.0.0.0:47100, order=LITTLE_ENDIAN, sndQueueLimit=0, directMode=true, sslFilter=null, msgQueueLsnr=null, readerMoveCnt=0, writerMoveCnt=0, readWriteSelectorsAssign=false], shmemSrv=null, usePairedConnections=false, connectionsPerNode=1, tcpNoDelay=true, filterReachableAddresses=false, ackSndThreshold=32, unackedMsgsBufSize=0, sockWriteTimeout=2000, boundTcpPort=47100, boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null, ctxInitLatch=java.util.concurrent.CountDownLatch@4186e275[Count = 1], stopping=false]
DEBUG [org.apache.ignite.internal.managers.communication.GridIoManager] (ServerService Thread Pool -- 5) Starting SPI implementation: org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi
DEBUG [org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi] (ServerService Thread Pool -- 5) Using parameter [locAddr=null]
DEBUG [org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi] (ServerService Thread Pool -- 5) Using parameter [locPort=47100]
DEBUG [org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] Grid runnable started: tcp-disco-srvr
DEBUG [org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder] (ServerService Thread Pool -- 5) Getting Apache Ignite endpoints from: https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/default/endpoints/ignite
DEBUG [org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder] (ServerService Thread Pool -- 5) Added an address to the list: 10.244.0.93
DEBUG [org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder] (ServerService Thread Pool -- 5) Added an address to the list: 10.244.0.94
ERROR [org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread Pool -- 5) Exception on direct send: Invalid argument (connect failed): java.net.ConnectException: Invalid argument (connect failed)
at java.net.PlainSocketImpl.socketConnect(Native Method)
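I logged in to the pod directly and tried to reach the other nodes/pods, but neither
echo > /dev/tcp/10.244.0.93/47500
nor
echo > /dev/tcp/10.244.0.94/47500
works.
On the other hand,
echo > /dev/tcp/127.0.0.1/47500
does work. This makes me think that Ignite is listening only on the local loopback address.
Pod/node 2 shows similar logs.
Here is the Kubernetes configuration: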
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pgdata
  namespace: default
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ignite
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: ignite
  namespace: default
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - endpoints
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: ignite
roleRef:
  kind: ClusterRole
  name: ignite
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: ignite
  namespace: default
---
apiVersion: v1
kind: Service
metadata:
  name: ignite
  namespace: default
spec:
  clusterIP: None # custom value.
  ports:
    - port: 9042 # custom value.
  selector:
    type: processing-engine-node
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-tenant-1
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: database-tenant-1
  template:
    metadata:
      labels:
        app: database-tenant-1
    spec:
      containers:
        - name: database-tenant-1
          image: postgres:12
          env:
            - name: "POSTGRES_USER"
              value: "admin"
            - name: "POSTGRES_PASSWORD"
              value: "admin"
            - name: "POSTGRES_DB"
              value: "tenant1"
          volumeMounts:
            - name: pgdata
              mountPath: /var/lib/postgresql/data
              subPath: postgres
          ports:
            - containerPort: 5432
          readinessProbe:
            exec:
              command: ["psql", "-W", "admin", "-U", "admin", "-d", "tenant1", "-c", "SELECT 1"]
            initialDelaySeconds: 15
            timeoutSeconds: 2
          livenessProbe:
            exec:
              command: ["psql", "-W", "admin", "-U", "admin", "-d", "tenant1", "-c", "SELECT 1"]
            initialDelaySeconds: 45
            timeoutSeconds: 2
      volumes:
        - name: pgdata
          persistentVolumeClaim:
            claimName: pgdata
---
apiVersion: v1
kind: Service
metadata:
  name: database-tenant-1
  namespace: default
  labels:
    app: database-tenant-1
spec:
  type: NodePort
  ports:
    - port: 5432
  selector:
    app: database-tenant-1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: processing-engine-master
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: processing-engine-master
  template:
    metadata:
      labels:
        app: processing-engine-master
        type: processing-engine-node
    spec:
      serviceAccountName: ignite
      initContainers:
        - name: check-db-ready
          image: postgres:12
          command: ['sh', '-c',
            'until pg_isready -h database-tenant-1 -p 5432;
            do echo waiting for database; sleep 2; done;']
      containers:
        - name: xxxx-engine-master
          image: shostettlerprivateregistry.azurecr.io/xxx/xxx-application:4.2.5
          ports:
            - containerPort: 8081
            - containerPort: 11211 # REST port number.
            - containerPort: 47100 # communication SPI port number.
            - containerPort: 47500 # discovery SPI port number.
            - containerPort: 49112 # JMX port number.
            - containerPort: 10800 # SQL port number.
            - containerPort: 10900 # Thin clients port number.
          volumeMounts:
            - name: config-volume
              mountPath: /opt/project-postgres.yml
              subPath: project-postgres.yml
      volumes:
        - name: config-volume
          configMap:
            name: pe-config
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: processing-engine-worker
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: processing-engine-worker
  template:
    metadata:
      labels:
        app: processing-engine-worker
        type: processing-engine-node
    spec:
      serviceAccountName: ignite
      initContainers:
        - name: check-db-ready
          image: postgres:12
          command: ['sh', '-c',
            'until pg_isready -h database-tenant-1 -p 5432;
            do echo waiting for database; sleep 2; done;']
      containers:
        - name: xxx-engine-worker
          image: shostettlerprivateregistry.azurecr.io/xxx/xxx-worker:4.2.5
          ports:
            - containerPort: 8081
            - containerPort: 11211 # REST port number.
            - containerPort: 47100 # communication SPI port number.
            - containerPort: 47500 # discovery SPI port number.
            - containerPort: 49112 # JMX port number.
            - containerPort: 10800 # SQL port number.
            - containerPort: 10900 # Thin clients port number.
          volumeMounts:
            - name: config-volume
              mountPath: /opt/project-postgres.yml
              subPath: project-postgres.yml
      volumes:
        - name: config-volume
          configMap:
            name: pe-config
And the Ignite configuration:
<bean id="tcpDiscoveryKubernetesIpFinder" class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder"/>
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="localPort" value="47500" />
        <property name="localAddress" value="127.0.0.1" />
        <property name="networkTimeout" value="10000" />
        <property name="ipFinder">
            <bean id="tcpDiscoveryKubernetesIpFinder" class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder"/>
        </property>
    </bean>
</property>
I expect the pods to be able to communicate and to end up with the following topology:
Topology snapshot [ver=1, locNode=a8e6a058, servers=2, clients=0, state=ACTIVE, CPUs=2, offheap=0.24GB, heap=1.5GB]
Answer 0 (score: 1)
You have discovery configured to bind to localhost:
<property name="localAddress" value="127.0.0.1" />
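This means that nodes in different containers cannot connect to each other, because each node's discovery port is reachable only from inside its own container. Try removing this line from the configuration.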
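For illustration, here is a minimal sketch of what the discoverySpi block from the question might look like with that line removed. Everything else is copied from the question unchanged. The comment about which address Ignite picks when localAddress is unset is an assumption based on the TcpDiscoverySpi documentation (first non-loopback address found), so verify it against the Ignite version you run.

```xml
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="localPort" value="47500" />
        <!-- localAddress is intentionally not set.
             Assumption: Ignite then selects a non-loopback address by itself,
             so other pods can reach port 47500 on the pod IP. -->
        <property name="networkTimeout" value="10000" />
        <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder"/>
        </property>
    </bean>
</property>
```

After redeploying, the same echo > /dev/tcp/<pod IP>/47500 check from the question should succeed from the other pod if the bind address was the only problem.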