Using Ray and JupyterHub on a Kubernetes cluster

Time: 2019-07-02 16:16:18

Tags: kubernetes jupyterhub ray

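I am building a Kubernetes cluster with JupyterHub and Ray, and I want users to log in to JupyterHub and use the Ray cluster running on k8s. My plan is to interact with the Ray cluster from a JupyterHub notebook using the Ray API (https://ray.readthedocs.io/en/latest/api.html).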

kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                                          AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP                                          6d19h
ray-head     ClusterIP   10.100.19.93   <none>        6379/TCP,6380/TCP,6381/TCP,12345/TCP,12346/TCP   4d21h

However, when I run the following, I get this error:

import ray
ray.init(redis_address="10.100.19.93:6379")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-f25708b1f128> in <module>
----> 1 ray.init(redis_address="10.100.19.93:6379")

/opt/conda/lib/python3.7/site-packages/ray/worker.py in init(redis_address, num_cpus, num_gpus, resources, object_store_memory, redis_max_memory, log_to_driver, node_ip_address, object_id_seed, local_mode, redirect_worker_output, redirect_output, ignore_reinit_error, num_redis_shards, redis_max_clients, redis_password, plasma_directory, huge_pages, include_webui, driver_id, configure_logging, logging_level, logging_format, plasma_store_socket_name, raylet_socket_name, temp_dir, load_code_from_local, _internal_config)
   1434             load_code_from_local=load_code_from_local)
   1435         _global_node = ray.node.Node(
-> 1436             ray_params, head=False, shutdown_at_exit=False, connect_only=True)
   1437 
   1438     connect(

/opt/conda/lib/python3.7/site-packages/ray/node.py in __init__(self, ray_params, head, shutdown_at_exit, connect_only)
    100             redis_client = self.create_redis_client()
    101             self.session_name = ray.utils.decode(
--> 102                 redis_client.get("session_name"))
    103 
    104         self._init_temp(redis_client)

/opt/conda/lib/python3.7/site-packages/ray/utils.py in decode(byte_str, allow_none)
    175     if not isinstance(byte_str, bytes):
    176         raise ValueError(
--> 177             "The argument {} must be a bytes object.".format(byte_str))
    178     if sys.version_info >= (3, 0):
    179         return byte_str.decode("ascii")

ValueError: The argument None must be a bytes object.

I would like to know whether my approach is correct and how to fix the error.

1 answer:

Answer 0 (score: 0)

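As long as you are not running ray.init(redis_address="10.100.19.93:6379") from inside the cluster, you have to expose the ray-head service through a LoadBalancer or a NodePort, depending on where your cluster runs. A ClusterIP address is only routable from within the cluster.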
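
A quick way to see whether the address is reachable from wherever the notebook runs is a plain TCP connection attempt. Below is a minimal sketch using only the Python standard library. The helper name can_reach is hypothetical, and a successful connection only shows that the port is open, not that Ray itself is healthy.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the endpoint cannot be reached from this machine.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call from a notebook cell, using the ClusterIP from the question:
# can_reach("10.100.19.93", 6379)
```

If this returns False from the notebook, the service is not reachable from there and needs to be exposed as described here.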

So run kubectl edit svc ray-head and change

type: ClusterIP

to

type: NodePort

Once that is done, try ray.init(redis_address="<node-ip-address>:<node-port>"), where <node-port> is the node port mapped to service port 6379.
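
For a NodePort service, kubectl get svc prints each entry in the PORT(S) column as port:nodePort/protocol, so the node port can be read off next to 6379. Here is a small sketch that pulls it out of that column and builds the address string. The helper names, the node IP 192.0.2.10 and the node port 31234 are made-up placeholders, not values from your cluster.

```python
from typing import Optional


def node_port_for(ports_column: str, service_port: int) -> Optional[int]:
    """Return the node port mapped to service_port, or None if there is none.

    ports_column is the PORT(S) value printed by `kubectl get svc`, e.g.
    "6379:31234/TCP,6380:30001/TCP" for a NodePort service.
    """
    for entry in ports_column.split(","):
        mapping = entry.strip().split("/", 1)[0]  # drop the "/TCP" suffix
        if ":" not in mapping:
            continue  # ClusterIP-style entry such as "6379/TCP": no node port
        port, node_port = mapping.split(":", 1)
        if int(port) == service_port:
            return int(node_port)
    return None


def redis_address(node_ip: str, ports_column: str, service_port: int = 6379) -> str:
    """Build the "<node-ip-address>:<node-port>" string for ray.init."""
    node_port = node_port_for(ports_column, service_port)
    if node_port is None:
        raise ValueError(f"no node port published for service port {service_port}")
    return f"{node_ip}:{node_port}"


# Placeholder values; substitute your own node IP and PORT(S) column.
# import ray
# ray.init(redis_address=redis_address("192.0.2.10", "6379:31234/TCP"))
```

With the ClusterIP output from the question ("6379/TCP,..."), node_port_for returns None, which is a sign the service type has not been changed yet.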