Is there a way to limit the maximum memory usage of the Ray object store?

Time: 2019-09-03 04:57:34

Tags: python ray

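I am trying to use Ray's parallelization model to process a file record by record. The code works beautifully, but the object store grows quickly and eventually crashes. I avoid calling ray.get(function.remote()) on every record because it hurts performance: the job consists of millions of subtasks, and waiting for each task to finish adds overhead. Is there a way to set a global limit on the object store?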

# Code that constantly applies backpressure to the object store, freeing space,
# but performs worse than serial execution
for record in infile:
    ray.get(createNucleotideCount.remote(record, copy.copy(dinucDict), copy.copy(tetranucDict), dinucList, tetranucList, filename))

# Code that maximizes throughput but makes the object store grow constantly
for record in infile:
    createNucleotideCount.remote(record, copy.copy(dinucDict), copy.copy(tetranucDict), dinucList, tetranucList, filename)

# The called function returns either 0 or 1.

1 Answer:

Answer 0: (score: 0)

You can call ray.init(object_store_memory=10**9) to limit the object store to 1 GB (the value is given in bytes).

The memory management documentation at https://ray.readthedocs.io/en/latest/memory-management.html has more information.
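Besides capping the store, the two loops in the question suggest a middle ground: keep only a bounded number of tasks in flight instead of waiting on every task or on none. The following is a minimal sketch of that idea, not something taken from the Ray documentation. The helper name run_with_backpressure and the max_in_flight parameter are hypothetical, and the submit/wait/fetch steps are passed in as callables so the logic can be tried without a cluster.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_with_backpressure(
    records: Iterable[Any],
    submit: Callable[[Any], Any],
    wait_one: Callable[[List[Any]], Tuple[List[Any], List[Any]]],
    fetch: Callable[[List[Any]], List[int]],
    max_in_flight: int = 1000,  # hypothetical default; tune for your workload
) -> int:
    """Submit one task per record, keeping at most max_in_flight pending.

    Returns the sum of the task results (each task returns 0 or 1).
    """
    pending: List[Any] = []
    total = 0
    for record in records:
        if len(pending) >= max_in_flight:
            # Block until at least one task has finished, collect its
            # result, and keep only the handles that are still running.
            ready, pending = wait_one(pending)
            total += sum(fetch(ready))
        pending.append(submit(record))
    # Drain the tasks still running once the input is exhausted.
    total += sum(fetch(pending))
    return total


# Assumed wiring with Ray, using the names from the question:
# total = run_with_backpressure(
#     infile,
#     submit=lambda rec: createNucleotideCount.remote(
#         rec, copy.copy(dinucDict), copy.copy(tetranucDict),
#         dinucList, tetranucList, filename),
#     wait_one=lambda refs: ray.wait(refs, num_returns=1),
#     fetch=ray.get,
# )
```

With this pattern the number of unfinished results is bounded by max_in_flight, so the loop only blocks when the cap is reached rather than after every record.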