Error in Ray: "ModuleNotFoundError: No module named 'pandas'"

Time: 2020-10-14 08:35:15

Tags: python pandas ray

I started Ray from a terminal inside an environment named p_c, which has pandas installed, using the command ray start --head --num-cpus=2 --num-gpus=0

I then ran the following Python script:

import ray
import os
import pandas as pd
import sys

ray.init(address='auto', redis_password='5241590000000000')

@ray.remote
def foo():
    import pandas as pd
    print("This runs on the VM")
    print(os.getcwd())
    print(sys.path)
    data = pd.read_csv('/Documents/sample.data')
    
    return 1

print("This runs locally")
print(ray.get(foo.remote()))

Running it raises the following error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
    I1014 13:56:23.410329 16563 16563 global_state_accessor.cc:25] Redis server address = 192.168.29.24:6379, is test flag = 0
    I1014 13:56:23.411886 16563 16563 redis_client.cc:146] RedisClient connected.
    I1014 13:56:23.421353 16563 16563 redis_gcs_client.cc:89] RedisGcsClient Connected.
    I1014 13:56:23.423465 16563 16563 service_based_gcs_client.cc:193] Reconnected to GCS server: 192.168.29.24:37125
    I1014 13:56:23.424247 16563 16563 service_based_accessor.cc:92] Reestablishing subscription for job info.
    I1014 13:56:23.424291 16563 16563 service_based_accessor.cc:422] Reestablishing subscription for actor info.
    I1014 13:56:23.424387 16563 16563 service_based_accessor.cc:797] Reestablishing subscription for node info.
    I1014 13:56:23.424415 16563 16563 service_based_accessor.cc:1073] Reestablishing subscription for task info.
    I1014 13:56:23.424441 16563 16563 service_based_accessor.cc:1248] Reestablishing subscription for object locations.
    I1014 13:56:23.424466 16563 16563 service_based_accessor.cc:1368] Reestablishing subscription for worker failures.
    I1014 13:56:23.424504 16563 16563 service_based_gcs_client.cc:86] ServiceBasedGcsClient Connected.
    This runs locally
    Traceback (most recent call last):
      File "hello1.py", line 26, in <module>
        print(ray.get(foo.remote()))
      File "/home/jatin/.local/lib/python3.8/site-packages/ray/worker.py", line 1538, in get
        raise value.as_instanceof_cause()
    ray.exceptions.RayTaskError(ModuleNotFoundError): ray::__main__.foo() (pid=16182, ip=192.168.29.24)
      File "python/ray/_raylet.pyx", line 479, in ray._raylet.execute_task
      File "hello1.py", line 17, in foo
        import pandas as pd
    ModuleNotFoundError: No module named 'pandas'

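I have installed pandas in every location I could think of. I don't understand where exactly the worker is looking when it fails to find the pandas module. Without the pandas import, the code runs fine.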

1 Answer:

Answer 0 (score: 0)

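The Ray runtime looks for pandas in the virtual environment it was configured with. If you start Ray locally, make sure the required Python libraries are installed in the virtual environment that serves the Ray runtime.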

For example:

. .venv/bin/activate
pip install pandas
ray start --num-cpus=8 --object-store-memory=7000000000 --head
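
To check which environment a worker is actually using, one option is to have the task report its own interpreter. Below is a minimal sketch using only the standard library; environment_report is a hypothetical helper name, not part of Ray. Note that the traceback in the question shows Ray being loaded from /home/jatin/.local/lib/python3.8/site-packages, which may mean the ray command came from a user-level install rather than from the p_c environment, though that is only a guess from the paths shown.

```python
import importlib.util
import sys


def environment_report(module_name: str = "pandas") -> dict:
    """Describe the interpreter running this code and whether it can find a module.

    Hypothetical helper for illustration; it does not call any Ray API.
    """
    # find_spec returns None when a top-level module is not on sys.path.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root of the active (virtual) environment
        "module": module_name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # Run this in the driver to see the driver's environment.
    print(environment_report())
```

Printing environment_report() in the driver script and returning environment_report() from foo (in place of the pandas import) lets you compare the two results. If "executable" or "prefix" differ, the workers were started from a different environment than the driver, and pandas needs to be installed there, or Ray restarted from the intended environment.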