PHP pthreads - How do you perform garbage collection?

Time: 2014-04-28 12:32:26

Tags: php multithreading memory-leaks garbage-collection pthreads

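Given the following code, how can I make sure that finished MyWorker objects are destroyed and their memory is freed?

My script needs about 50 threads that continuously fetch data with cURL and process it.

I have tried keeping the threads inside run() forever, and, as in this sample code, letting them leave run() and having the collect function spawn new copies of them.

Either way, I hit the memory limit after about a minute. Can you tell me what I am doing wrong?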

class MyWorker extends Threaded
{
    public $complete;
    public function __construct() {$this->complete = false;}
    public function run() {$this->complete = true;}
}

$pool = new Pool(50);
for($i=0; $i<50; $i++)
    $pool->submit(new MyWorker());
$pool->collect(function($worker)
{
    global $pool;
    if($worker->complete == true)
        $pool->submit(new MyWorker());
    return $worker->complete;
});
$pool->shutdown();

1 answer:

Answer 0 (score: 8)

Why

Why do I need to collect?

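The Worker threads that pthreads provides require the programmer to retain proper references to the Threaded objects that are being executed. This is difficult to achieve reliably in userland, so pthreads provides the Pool abstraction over Workers, which maintains those references for you.

To maintain those references, pthreads needs to know when an object is garbage, and it provides the Pool::collect interface for this. Pool::collect takes a Closure that should accept a Threaded object and return boolean true if the passed object has completed execution.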
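
To make the collector contract concrete, here is a minimal plain-PHP sketch that runs without the pthreads extension. SketchTask and SketchPool are hypothetical stand-ins invented for this illustration, not pthreads classes, and the sketch only models the idea described above: the pool holds a reference to every submitted task until the collector returns true for it. It says nothing about how pthreads implements this internally.

```php
<?php
/* Hypothetical stand-in for a pooled task; NOT a pthreads class. */
class SketchTask {
    public $done = false;
    public function run() { $this->done = true; }
}

/* Hypothetical model of the reference list a pool keeps for submitted tasks. */
class SketchPool {
    private $work = [];

    public function submit(SketchTask $task) {
        /* the pool holds a reference until the task is collected */
        $this->work[] = $task;
    }

    /* Drop every task the collector approves; return how many are still held. */
    public function collect(callable $collector) {
        $this->work = array_values(array_filter(
            $this->work,
            function ($task) use ($collector) { return !$collector($task); }
        ));
        return count($this->work);
    }
}

$pool = new SketchPool();
$a = new SketchTask();
$b = new SketchTask();
$pool->submit($a);
$pool->submit($b);

$a->run(); /* only the first task finishes */

/* the collector returns true for finished tasks, so only $a is released */
$remaining = $pool->collect(function ($task) { return $task->done; });
echo $remaining, "\n"; /* prints 1 */
```

In this model, a task whose collector never returns true stays referenced by the pool for as long as the pool lives, which is consistent with the steady memory growth described in the question.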

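How

The task at hand ...

To keep submitting tasks for execution without exhausting resources, you must build a queue of completed tasks and resubmit them to the Pool.

The following code demonstrates a sane way to do that: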

<?php

define("LOG", Mutex::create());
/* thread safe log to stdout */
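function slog($message) {
    /* first argument is the format string, the rest are its values */
    $args = func_get_args();
    if (($message = array_shift($args))) {
        /* hold the mutex so lines from different threads do not interleave */
        Mutex::lock(LOG);
        vprintf("{$message}\n", $args);
        Mutex::unlock(LOG);
    }
}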

class Request extends Threaded {
    public function __construct($url) {
        $this->url = $url;
    }

    public function run() {
        $response = @file_get_contents($this->url);

        slog("%s returned %d bytes",
            $this->url, strlen($response));

        $this->reQueue();
    }

    public function getURL()        { return $this->url; }

    public function isQueued()      { return $this->queued; }
    public function reQueue()       { $this->queued = true; }

    protected $url;
    protected $queued = false;
}

/* create a pool of 50 threads */
$pool = new Pool(50);

/* submit 50 requests for execution */
for ($i = 1; $i <= 50; $i++) {
    $pool->submit(new Request(sprintf(
        "http://google.com/?q=%s", md5($i))));
}

do {
    $queue = array();

    $pool->collect(function($request) use (&$queue) {
        /* check for items to requeue */
        if ($request->isQueued()) {
            /* get the url for the request, insert into queue */
            $queue[] = $request->getURL();
            /* allow this job to be collected */
            return true;
        }
        /* not finished yet: keep the pool's reference to this job */
        return false;
    });

    /* resubmit completed tasks to pool */
    if (count($queue)) {
        foreach ($queue as $queued)
            $pool->submit(new Request($queued));
    }

    /* pause 2.5 seconds between passes, to be nice to the remote server */
    usleep(2500000);
} while (true);
?>