我一直试图在k8s集群上设置RabbitMQ,我终于完成了所有设置,但是在managementUI上只显示了一个节点。这是我的步骤:
我这样做是为了启用autocluster:
FROM rabbitmq:3.8-rc-management-alpine
MAINTAINER kevlai
RUN rabbitmq-plugins --offline enable rabbitmq_peer_discovery_k8s
apiVersion: v1
kind: ServiceAccount
metadata:
name: borecast-rabbitmq
namespace: borecast-production
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: borecast-rabbitmq
namespace: borecast-production
rules:
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: borecast-rabbitmq
namespace: borecast-production
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: dev
subjects:
- kind: ServiceAccount
name: borecast-rabbitmq
namespace: borecast-production
apiVersion: v1
kind: Secret
metadata:
name: rabbitmq-secret
namespace: borecast-production
type: Opaque
data:
username: a2V2
password: Ym9yZWNhc3RydWx6
secretCookie: c2VjcmV0Y29va2llaGVyZQ==
我正在设置StorageClass
,因此k8s会自动在AWS上为我进行配置。
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: rabbitmq-sc
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
zone: us-east-2a
reclaimPolicy: Retain
您可以看到有两项服务。无头服务是针对吊舱本身的。至于管理服务,我将为Ingress控制器公开该服务,以便可以从外部对其进行访问。
---
apiVersion: v1
kind: Service
metadata:
name: borecast-rabbitmq-management-service
namespace: borecast-production
labels:
app: borecast-rabbitmq
spec:
ports:
- port: 15672
targetPort: 15672
name: http
- port: 5672
targetPort: 5672
name: amqp
selector:
app: borecast-rabbitmq
---
apiVersion: v1
kind: Service
metadata:
name: borecast-rabbitmq-service
namespace: borecast-production
labels:
app: borecast-rabbitmq
spec:
clusterIP: None
ports:
- port: 5672
name: amqp
selector:
app: borecast-rabbitmq
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: borecast-rabbitmq
namespace: borecast-production
spec:
serviceName: borecast-rabbitmq-service
replicas: 3
template:
metadata:
labels:
app: borecast-rabbitmq
spec:
serviceAccountName: borecast-rabbitmq
containers:
- image: docker.borecast.com/borecast-rabbitmq:v1.0.3
name: borecast-rabbitmq
imagePullPolicy: Always
resources:
requests:
memory: "256Mi"
cpu: "150m"
limits:
memory: "512Mi"
cpu: "250m"
ports:
- containerPort: 5672
name: amqp
env:
- name: RABBITMQ_DEFAULT_USER
valueFrom:
secretKeyRef:
name: rabbitmq-secret
key: username
- name: RABBITMQ_DEFAULT_PASS
valueFrom:
secretKeyRef:
name: rabbitmq-secret
key: password
- name: RABBITMQ_ERLANG_COOKIE
valueFrom:
secretKeyRef:
name: rabbitmq-secret
key: secretCookie
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: K8S_SERVICE_NAME
# value: borecast-rabbitmq-service.borecast-production.svc.cluster.local
value: borecast-rabbitmq-service
- name: RABBITMQ_USE_LONGNAME
value: "true"
- name: RABBITMQ_NODENAME
value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME)"
# value: rabbit@$(MY_POD_NAME).borecast-rabbitmq-service.borecast-production.svc.cluster.local
- name: RABBITMQ_NODE_TYPE
value: disc
- name: AUTOCLUSTER_TYPE
value: "k8s"
- name: AUTOCLUSTER_DELAY
value: "10"
- name: AUTOCLUSTER_CLEANUP
value: "true"
- name: CLEANUP_WARN_ONLY
value: "false"
- name: K8S_ADDRESS_TYPE
value: "hostname"
- name: K8S_HOSTNAME_SUFFIX
value: ".$(K8S_SERVICE_NAME)"
# value: .borecast-rabbitmq-service.borecast-production.svc.cluster.local
volumeMounts:
- name: rabbitmq-volume
mountPath: /var/lib/rabbitmq
imagePullSecrets:
- name: regcred
volumeClaimTemplates:
- metadata:
name: rabbitmq-volume
namespace: borecast-production
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: rabbitmq-sc
resources:
requests:
storage: 5Gi
一切正常。但是,当我访问管理UI(即,我访问borecast-rabbitmq-management-service
,端口15672)时,我只看到一个节点出现,应该是三个:
还要注意,群集名称为
rabbit@borecast-rabbitmq-0.borecast-rabbitmq-service.borecast-production.svc.cluster.local
但是当我注销并再次登录时,有时数字0会更改为1
的{{1}}或2
。
还要注意,节点名称是
borecast-rabbitmq-0
您猜到了,有时数字是rabbit@borecast-rabbitmq-1.borecast-rabbitmq-service
或2
的数字。
我一直在尝试调试,但无济于事。每个Pod的日志不会引起任何怀疑,并且每个服务和statefulset都可以正常工作。我多次重复了五个步骤,如果您的集群在AWS上,则可以按照以下步骤完全复制我的设置(当然是在创建名称空间0
之后)。如果有人可以阐明这件事,我将永远感激不已。
答案 0 :(得分:0)
问题在于无头服务名称定义:
- name: K8S_SERVICE_NAME
# value: borecast-rabbitmq-service.borecast-production.svc.cluster.local
value: borecast-rabbitmq-service
这是节点名称的组成部分:
- name: RABBITMQ_NODENAME
value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME)"
正确的节点名称应为POD的FQDN(<statefulset name>-<ordinal index>.<headless_svc_name>.<namespace>.svc.cluster.local
):
- name: RABBITMQ_NODENAME
value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local"
因此,您最终获得了NodeName
borecast-rabbitmq-1.borecast-rabbitmq-service
代替:
borecast-rabbitmq-1.borecast-rabbitmq-service.borecast-production.svc.cluster.local
按照here的说明,使用nslookup util在群集内部查找borcast-rabbitmq StatefulSet创建的pod的fqdn(换句话说:Pod的SRV记录),以了解RABBITMQ_NODENAME是什么形式预期会有。
答案 1 :(得分:0)
尝试为无头服务暴露 4369;