Issue using Python's multiprocessing Queue with Ray

Asked: 2019-05-19 04:05:12

Tags: python queue multiprocessing ray

I would like to use Ray actors to spawn workers, where each worker uses Python's multiprocessing package to spawn a process that generates random integer data. Each worker should then store its data either in its own local queue or in a shared global queue. The queues are multiprocessing.Queue objects.

I have confirmed that the workers are spawned and generate data correctly. The issue is getting them to store the data in the queues.

I know I could achieve what I want with Python's multiprocessing package alone (a rough sketch of that baseline is below), but I would really like to know how to use multiprocessing queues correctly with Ray actors. The code was run in Google Colab.
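For reference, this is roughly the pure-multiprocessing version I mean (a minimal sketch of my baseline; the function and variable names are just for illustration). The parent creates one shared Queue, passes it to each child Process at construction time (i.e. by inheritance), and drains it after the children finish:

import numpy as np
import time
from multiprocessing import Process, Queue

def generate_data(w_id, q):
  # Each child process pushes a few random integers into the shared queue.
  for _ in range(5):
    q.put((w_id, np.random.randint(1, 4)))
    time.sleep(1)

if __name__ == '__main__':
  q = Queue()
  # The queue is passed to each Process at construction time, which is
  # the sharing mode multiprocessing supports ("inheritance").
  ps = [Process(target=generate_data, args=(w_id, q)) for w_id in range(2)]
  for p in ps:
    p.start()
  for p in ps:
    p.join()
  while not q.empty():
    print(q.get())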

I have provided three pieces of code: A, B & C. A runs with no issues; it spawns workers and generates data correctly, and no queues are used. B adds only local worker queues, and C adds only a global queue without any local worker queues. Both B and C produce errors.

A (no queues used; runs with no issues):

import numpy as np
import ray, time
from multiprocessing import Process

@ray.remote
class Worker(object):
  def __init__(self, w_id):
    self.w_id = w_id
    # Each actor spawns a child process that generates data forever.
    self.process = Process(target=self._generate_data, args=())
    self.process.start()

  def _generate_data(self):
    while True:
      data = np.random.randint(1, 4)
      time.sleep(1)
      print(self.w_id, data)

if __name__ == '__main__':
  ray.init(ignore_reinit_error=True)

  Ws = [Worker.remote(w_id) for w_id in range(2)]
  # Keep the driver alive for about ten seconds so the workers can run.
  i = 0
  while True:
    time.sleep(1)
    print(i)
    if i > 9:
      break
    i += 1

  ray.shutdown()

B (adds local worker queues; produces the error shown below):

import numpy as np
import ray, time
from multiprocessing import Process, Queue

@ray.remote
class Worker(object):
  def __init__(self, w_id):
    self.w_id = w_id
    self.queue = Queue()  # <- this line raises the KeyError shown below
    self.process = Process(target=self._generate_data, args=())
    self.process.start()

  def _generate_data(self):
    while True:
      data = np.random.randint(1, 4)
      self.queue.put(data)
      time.sleep(1)

if __name__ == '__main__':
  ray.init(ignore_reinit_error=True)

  Ws = [Worker.remote(w_id) for w_id in range(1)]
  i = 0
  while True:
    time.sleep(2)
    if i > 9:
      break
    i += 1

  ray.shutdown()

C (adds a shared global queue, no local worker queues; produces the error shown below):

import numpy as np
import ray, time
from multiprocessing import Process, Queue

@ray.remote
class Worker(object):
  def __init__(self, w_id, g_q):
    self.w_id = w_id
    self.queue = g_q
    # Note: args must be a tuple, i.e. (g_q,) rather than (g_q).
    self.process = Process(target=self._generate_data, args=(g_q,))
    self.process.start()

  def _generate_data(self, g_q):
    while True:
      data = np.random.randint(1, 4)
      g_q.put(data)
      time.sleep(1)
      print(self.w_id, data)

if __name__ == '__main__':
  ray.init(ignore_reinit_error=True)
  g_q = Queue()
  # Passing g_q into the actor constructor fails to serialize;
  # see the traceback below.
  Ws = [Worker.remote(w_id, g_q) for w_id in range(2)]
  i = 0
  while True:
    time.sleep(1)
    print(i)
    if i > 9:
      break
    i += 1

  ray.shutdown()

A's output is correct:

2019-05-19 02:26:42,328 WARNING worker.py:1341 -- WARNING: Not updating worker name since `setproctitle` is not installed. Install this with `pip install setproctitle` (or ray[debug]) to enable monitoring of worker processes.
2019-05-19 02:26:42,330 INFO node.py:497 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-19_02-26-42_330364_128/logs.
2019-05-19 02:26:42,442 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:13715 to respond...
2019-05-19 02:26:42,580 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:57570 to respond...
2019-05-19 02:26:42,584 INFO services.py:806 -- Starting Redis shard with 2.58 GB max memory.
2019-05-19 02:26:42,632 INFO node.py:511 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-19_02-26-42_330364_128/logs.
2019-05-19 02:26:42,637 INFO services.py:1441 -- Starting the Plasma object store with 3.87 GB memory using /dev/shm.
2019-05-19 02:26:42,761 WARNING actor.py:614 -- Actor is garbage collected in the wrong driver. Actor id = ActorID(75150eec35127d050bf435039332f2bcc43e6a11), class name = Worker.
2019-05-19 02:26:42,765 WARNING actor.py:614 -- Actor is garbage collected in the wrong driver. Actor id = ActorID(6791cde57023d522b51b44b3a3564c7b152beb62), class name = Worker.
0
(pid=801) 0 2
1
(pid=800) 1 2
(pid=801) 0 1
2
(pid=800) 1 3
(pid=801) 0 1
3
(pid=800) 1 2
(pid=801) 0 3
4
(pid=800) 1 1
(pid=801) 0 3
5
(pid=800) 1 1
(pid=801) 0 3
6
(pid=800) 1 1
(pid=801) 0 2
7
(pid=800) 1 3
(pid=801) 0 2
8
(pid=800) 1 2
(pid=801) 0 2
9
(pid=800) 1 3
(pid=801) 0 3
10

B's error message:

2019-05-19 02:21:17,932 WARNING worker.py:1341 -- WARNING: Not updating worker name since `setproctitle` is not installed. Install this with `pip install setproctitle` (or ray[debug]) to enable monitoring of worker processes.
2019-05-19 02:21:17,935 INFO node.py:497 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-19_02-21-17_934897_128/logs.
2019-05-19 02:21:18,047 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:42660 to respond...
2019-05-19 02:21:18,184 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:55986 to respond...
2019-05-19 02:21:18,188 INFO services.py:806 -- Starting Redis shard with 2.58 GB max memory.
2019-05-19 02:21:18,231 INFO node.py:511 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-19_02-21-17_934897_128/logs.
2019-05-19 02:21:18,234 INFO services.py:1441 -- Starting the Plasma object store with 3.87 GB memory using /dev/shm.
2019-05-19 02:21:18,341 WARNING actor.py:614 -- Actor is garbage collected in the wrong driver. Actor id = ActorID(4171cdedcc979400748671ace92a1bcefb10f213), class name = Worker.
2019-05-19 02:21:23,976 ERROR worker.py:1616 -- Possible unhandled error from worker: ray_worker (pid=320, host=0366b816fc55)
  File "<ipython-input-4-d91015ddd20f>", line 9, in __init__
  File "/usr/lib/python3.6/multiprocessing/context.py", line 101, in Queue
    from .queues import Queue
KeyError: "'__name__' not in globals"

C's error message:

2019-05-19 02:43:09,760 WARNING worker.py:1341 -- WARNING: Not updating worker name since `setproctitle` is not installed. Install this with `pip install setproctitle` (or ray[debug]) to enable monitoring of worker processes.
2019-05-19 02:43:09,763 INFO node.py:497 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-19_02-43-09_763458_128/logs.
2019-05-19 02:43:09,888 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:36483 to respond...
2019-05-19 02:43:10,060 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:46985 to respond...
2019-05-19 02:43:10,070 INFO services.py:806 -- Starting Redis shard with 2.58 GB max memory.
2019-05-19 02:43:10,126 INFO node.py:511 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-19_02-43-09_763458_128/logs.
2019-05-19 02:43:10,128 INFO services.py:1441 -- Starting the Plasma object store with 3.87 GB memory using /dev/shm.
2019-05-19 02:43:10,341 WARNING worker.py:342 -- WARNING: Falling back to serializing objects of type <class '_multiprocessing.SemLock'> by using pickle. This may be inefficient.
2019-05-19 02:43:10,348 WARNING worker.py:402 -- WARNING: Serializing the class <class 'multiprocessing.queues.Queue'> failed, so are are falling back to cloudpickle.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/ray/worker.py in put_object(self, object_id, value)
    382         try:
--> 383             self.store_and_register(object_id, value)
    384         except pyarrow.PlasmaObjectExists:

... 26 frames omitted ...
TypeError: can't pickle _multiprocessing.SemLock objects

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/lib/python3.6/multiprocessing/context.py in assert_spawning(obj)
    354         raise RuntimeError(
    355             '%s objects should only be shared between processes'
--> 356             ' through inheritance' % type(obj).__name__
    357             )

RuntimeError: Queue objects should only be shared between processes through inheritance
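In case it helps an answerer: one fallback I am considering is dropping multiprocessing.Queue entirely and letting a plain Ray actor play the role of the shared queue, since actor handles (unlike Queue objects) can be passed to other tasks and actors freely. Below is a minimal, untested sketch of that idea; QueueActor and its push/pop_all methods are my own hypothetical names, not part of Ray's API.

import numpy as np
import ray, time

@ray.remote
class QueueActor(object):
  # Wraps a plain list; the actor handle, unlike a multiprocessing.Queue,
  # can be passed to other Ray tasks and actors.
  def __init__(self):
    self.items = []

  def push(self, item):
    self.items.append(item)

  def pop_all(self):
    items, self.items = self.items, []
    return items

@ray.remote
def generate_data(w_id, queue):
  # A Ray task in place of a multiprocessing.Process.
  for _ in range(10):
    queue.push.remote((w_id, np.random.randint(1, 4)))
    time.sleep(1)
  return w_id

if __name__ == '__main__':
  ray.init(ignore_reinit_error=True)
  g_q = QueueActor.remote()
  tasks = [generate_data.remote(w_id, g_q) for w_id in range(2)]
  ray.get(tasks)  # wait for the generators to finish
  print(ray.get(g_q.pop_all.remote()))
  ray.shutdown()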
