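Using the Pool class of the pthreads extension in PHP 7

Date: 2016-11-28 21:54:49

Tags: php pthreads php-7

I took the most basic demo of the pthreads PHP 7 extension that uses the Pool class (this demo: https://github.com/krakjoe/pthreads#polyfill) and extended it a little so that I can get results back from the threads (or at least I thought I could):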

$pool = new Pool(4);

foreach (range(1, 8) as $i) {
    $pool->submit(new class($i) extends Threaded
    {
        public $i;
        private $garbage = false;

        public function __construct($i)
        {
            $this->i = $i;
        }

        public function run()
        {
            echo "Hello World\n";
            $this->result = $this->i * 2;
            $this->garbage = true;
        }

        public function isGarbage() : bool
        {
            return $this->garbage;
        }
    });
}

while ($pool->collect(function(Collectable $task) {
    if ($task->isGarbage()) {
        echo $task->i . ' ' . $task->result . "\n";
    }
    return $task->isGarbage();
})) continue;

$pool->shutdown();

What confuses me is that it sometimes fails to report the results of all tasks:

Hello World
Hello World
Hello World
Hello World
Hello World
1 2
2 4
Hello World
Hello World
3 6
Hello World
7 14
4 8
8 16

Here the two lines "5 10" and "6 12" are missing, and I don't understand why. It only happens some of the time (maybe 1 run in 10).

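It looks like the original demo targets an older version of pthreads, because the Collectable interface is now implemented automatically by Threaded, if I'm not mistaken.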

The README then says:

"The mechanism for Pool::collect was moved from Pool to Worker for a more robust Worker and simpler Pool inheritance."

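So I guess I'm doing something wrong.

Edit: I took the example from "How does Pool::collect works?" and updated it to work with the latest pthreads and current PHP 7, but the result is the same. It looks like it fails to collect results from the last threads to execute.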

$pool = new Pool(4);

while (@$i++<10) {
    $pool->submit(new class($i) extends Thread implements Collectable {
        public $id;
        private $garbage;

        public function __construct($id) {
            $this->id = $id;
        }

        public function run() {
            sleep(1);
            printf(
                "Hello World from %d\n", $this->id);
            $this->setGarbage();
        }

        public function setGarbage() {
            $this->garbage = true;
        }

        public function isGarbage(): bool {
            return $this->garbage;
        }

    });
}

while ($pool->collect(function(Collectable $work){
    printf(
        "Collecting %d\n", $work->id);
    return $work->isGarbage();
})) continue;

$pool->shutdown();

This prints the following, which clearly does not collect all of the threads:

Hello World from 1
Collecting 1
Hello World from 2
Collecting 2
Hello World from 3
Collecting 3
Hello World from 4
Collecting 4
Hello World from 5
Collecting 5
Hello World from 6
Hello World from 7
Collecting 6
Collecting 7
Hello World from 8
Hello World from 9
Hello World from 10

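2 Answers:

Answer 0 (score: 1)

As you have already quite rightly pointed out, the code you copied targets pthreads v2 (for PHP 5.x).

The problem boils down to the garbage collector in pthreads not being deterministic. It does not run in a predictable way, so it cannot be used reliably to fetch data from the tasks executed by the pool.

One way to get at this data is to pass Threaded objects into the tasks you submit to the pool: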

<?php

$pool = new Pool(4);
$data = [];

foreach (range(1, 8) as $i) {
    $dataN = new Threaded();
    $dataN->i = $i;

    $data[] = $dataN;

    $pool->submit(new class($dataN) extends Threaded {
        public $data;

        public function __construct($data)
        {
            $this->data = $data;
        }

        public function run()
        {
            echo "Hello World\n";
            $this->data->i *= 2;
        }
    });
}

while ($pool->collect());

$pool->shutdown();

foreach ($data as $dataN) {
    var_dump($dataN->i);
}

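Notes about the code above:

  • Collectable (now an interface in pthreads v3) is already implemented by the Threaded class, so there is no need to implement it yourself.
  • Once a task has been submitted to the pool, it is already considered garbage, so you do not need to handle that part yourself. You can still override the default garbage collector, but in the vast majority of cases (including yours) that is not necessary.
  • I still call the collect method (in a loop that blocks the main thread until all tasks have finished executing) so that tasks can be garbage collected (using pthreads' default collector), freeing memory while the pool is executing tasks.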
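
A variation on the same idea, offered only as a sketch: keep a reference to each task in the main thread and read the results from the task objects after Pool::shutdown() has returned, instead of reading them inside the collector. This assumes pthreads v3 on a thread-safe PHP 7 build, and relies on Pool::shutdown() blocking until every submitted task has executed, as the PHP manual describes. DoubleTask is a hypothetical class name used for illustration.

```php
<?php
// Requires the pthreads v3 extension (PHP 7, ZTS build).

// Hypothetical task: doubles its input and stores the result on itself.
class DoubleTask extends Threaded
{
    public $i;
    public $result;

    public function __construct(int $i)
    {
        $this->i = $i;
    }

    public function run()
    {
        $this->result = $this->i * 2;
    }
}

$pool  = new Pool(4);
$tasks = [];

foreach (range(1, 8) as $i) {
    $task    = new DoubleTask($i);
    $tasks[] = $task;        // keep our own reference to the task
    $pool->submit($task);
}

// Assumed (per the manual) to block until all submitted tasks have run.
$pool->shutdown();

// Results are read here, independently of when collection happened.
foreach ($tasks as $task) {
    echo $task->i, ' ', $task->result, "\n";
}
```

Because the results are read from references held by the main thread, the output no longer depends on when (or whether) the collector visits a given task.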

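Answer 1 (score: 0)

I had a similar problem, where collect would return true immediately. It turns out that collect returns once all of the work is in process, not once all of the work has completed. In my case it did not even get to process a task, so the collecting callback never ran.

So if my pool size is 4 and I submit only 3 tasks, the collector never runs and the script carries on immediately. Example: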

define ("CRLF", "\r\n");

class AsyncWork extends Thread {
  private $done = false;
  private $id;

  public function __construct($id) {
    $this->id = $id;
  }

  public function id() {
    return $this->id;
  }

  public function isCompleted() {
    return $this->done;
  }

  public function run() {
    echo '[AsyncWork] ' . $this->id . CRLF;
    sleep(rand(1,5));
    echo '[AsyncWork] sleep done ' . $this->id . CRLF;
    $this->done = true;
  }
}

$pool = new Pool(4);

for($i=1;$i<=3;$i++) {
  $pool->submit(new AsyncWork($i));
}

while ($pool->collect(function(AsyncWork $work){
    echo 'Collecting ['.$work->id().']: ' . ($work->isCompleted()?1:0) . CRLF;
    return $work->isGarbage();
})) continue;

echo 'ALL DONE' . CRLF;

$pool->shutdown();

This outputs:

[AsyncWork] 1
[AsyncWork] 2
ALL DONE
[AsyncWork] 3
[AsyncWork] sleep done 2
[AsyncWork] sleep done 3
[AsyncWork] sleep done 1

If I change the code above to submit more work than the pool size, it keeps collecting until all of the work is in process. E.g.:

for($i=1;$i<=10;$i++) {
  $pool->submit(new AsyncWork($i));
}

//results:

[AsyncWork] 1
[AsyncWork] 2
[AsyncWork] 3
[AsyncWork] 4
[AsyncWork] sleep done 4
[AsyncWork] 8
Collecting [4]: 1
[AsyncWork] sleep done 1
Collecting [1]: 1
[AsyncWork] 5
[AsyncWork] sleep done 3
Collecting [3]: 1
[AsyncWork] 7
[AsyncWork] sleep done 2
Collecting [2]: 1
[AsyncWork] 6
[AsyncWork] sleep done 6
Collecting [6]: 1
[AsyncWork] 10
[AsyncWork] sleep done 7
Collecting [7]: 1
[AsyncWork] sleep done 8
Collecting [8]: 1
[AsyncWork] sleep done 5
Collecting [5]: 1
ALL DONE
[AsyncWork] 9
[AsyncWork] sleep done 9
[AsyncWork] sleep done 10

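As you can see, it never collects the last tasks, and it returns before the work has finished.

The only way I could solve this was to handle it myself by keeping track of the task list.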

$pool = new Pool(4);

$worklist = [];
for($i=1;$i<=10;$i++) {
  $work = new AsyncWork($i);
  $worklist[] = $work;
  $pool->submit($work);
}

do {
  $alldone = true;
  foreach($worklist as $i=>$work) {
    if (!$work->isCompleted()) {
      $alldone = false;
    } else {
      echo 'Completed: '. $work->id(). CRLF;
      unset($worklist[$i]);
    }
  }

  if ($alldone) {
    break;
  }
} while(true);

while ($pool->collect(function(AsyncWork $work){
    echo 'Collecting ['.$work->id().']: ' . ($work->isCompleted()?1:0) . CRLF;
    return $work->isGarbage();
})) continue;

echo 'ALL DONE' . CRLF;

$pool->shutdown();

This was the only way I could make sure that ALL DONE is printed only when everything has actually finished.

[AsyncWork] 1
[AsyncWork] 2
[AsyncWork] 3
[AsyncWork] 4
[AsyncWork] sleep done 1
[AsyncWork] 5
Completed: 1
[AsyncWork] sleep done 2
Completed: 2
[AsyncWork] 6
[AsyncWork] sleep done 4
[AsyncWork] 8
Completed: 4
[AsyncWork] sleep done 6
[AsyncWork] sleep done 3
[AsyncWork] 7
Completed: 6
Completed: 3
[AsyncWork] sleep done 5
Completed: 5
[AsyncWork] 10
[AsyncWork] 9
[AsyncWork] sleep done 9
Completed: 9
[AsyncWork] sleep done 8
Completed: 8
[AsyncWork] sleep done 7
Completed: 7
[AsyncWork] sleep done 10
Completed: 10
Collecting [1]: 1
Collecting [5]: 1
Collecting [9]: 1
Collecting [2]: 1
Collecting [6]: 1
Collecting [10]: 1
Collecting [3]: 1
Collecting [7]: 1
Collecting [4]: 1
Collecting [8]: 1
ALL DONE
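
One caveat about the do/while polling loop in the workaround above: it never sleeps, so the main thread keeps a CPU core busy while it waits. A minimal sketch of the same tracking idea with a short pause between passes follows. waitForAll and FakeTask are hypothetical names, not part of pthreads. FakeTask only stands in for AsyncWork so that the sketch runs without the extension; in real use you would pass the $worklist array of AsyncWork objects instead.

```php
<?php
declare(strict_types=1);

/**
 * Blocks until every task reports completion, sleeping between passes.
 * Hypothetical helper: each task only needs an isCompleted() method.
 *
 * @param object[] $tasks
 * @return int number of polling passes performed
 */
function waitForAll(array $tasks, int $pollMicroseconds = 10000): int
{
    $passes = 0;
    while ($tasks) {
        $passes++;
        foreach ($tasks as $key => $task) {
            if ($task->isCompleted()) {
                unset($tasks[$key]); // safe: foreach by value iterates a copy
            }
        }
        if ($tasks) {
            usleep($pollMicroseconds); // yield the CPU before the next pass
        }
    }
    return $passes;
}

// Stand-in for AsyncWork: reports completion on its Nth poll.
final class FakeTask
{
    private $remaining;

    public function __construct(int $pollsNeeded)
    {
        $this->remaining = $pollsNeeded;
    }

    public function isCompleted(): bool
    {
        return --$this->remaining <= 0;
    }
}

$passes = waitForAll([new FakeTask(1), new FakeTask(3)], 1000);
echo "finished after {$passes} passes\n"; // prints "finished after 3 passes"
```

The polling interval is a trade-off: a longer sleep uses less CPU but delays the moment the main thread notices that the last task has finished.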