How do I filter the output of a scraper? (PHP)

Date: 2016-09-07 09:21:01

Tags: php

I am scraping a page and I get this result:

string(1) " "
string(15) " +0,25 pist.wit"
string(14) " +0,25 pist.br"
// and so on...

But I want a result like this:

0,25
0,25
//and so on...

So technically, I want to separate the price (without the + sign) from the bread name (pist.wit, etc.). Does anyone know how to do this? Here is my code:

public function onRun()
{
    $client = new Client();
    $crawler = $client->request('GET', 'http://www.sandwich-express.nl/online-bestellen/');
    $crawler->filter('tr')->each(function ($node) {
        if(sizeof($node->filter('.table-spacing')) > 0)
            var_dump('nieuwe headers next TR');
        $node->filter('tr.colomn_text td')->each(function ($node) {
            var_dump($node->text());
        });
    });

}

2 answers:

Answer 0: (score: 1)

I think you mean that you want the price and the name as two separate values, as shown below.

public function onRun()
{
    $client = new Client();
    $crawler = $client->request('GET', 'http://www.sandwich-express.nl/online-bestellen/');
    $crawler->filter('tr')->each(function ($node) {
        if(sizeof($node->filter('.table-spacing')) > 0)
            var_dump('nieuwe headers next TR');
        $node->filter('tr.colomn_text td')->each(function ($node) {
            $name = trim($node->text());
            $price = 0;
            if(0 === strpos($name, '+')) {
                $names = explode(' ', $name);
                $price = floatval(str_replace(['+', ','], ['', '.'], array_shift($names)));
                $name = implode(' ', $names);
            }
            var_dump($price, $name);
        });
    });
}

Result:

int(0)
string(0) ""
float(0.25)
string(7) "pist.br"
int(0)
string(7) "bol wit"
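
The string handling in the callback can also be tried on its own, without the crawler. The following is only a minimal sketch, assuming the cell texts look like the ones in the question's var_dump output. The function name parsePriceCell is hypothetical and not part of the original answer, and it uses a regular expression instead of explode. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper: split a scraped cell such as " +0,25 pist.wit"
// into a price string ("0,25") and a bread name ("pist.wit").
function parsePriceCell(string $text): ?array
{
    // Optional whitespace, a literal "+", a price with a comma decimal, then the name.
    if (preg_match('/^\s*\+(\d+,\d+)\s+(.*?)\s*$/u', $text, $m) !== 1) {
        return null; // not a price cell (for example an empty cell)
    }
    return ['price' => $m[1], 'name' => $m[2]];
}

// Sample inputs copied from the question's var_dump output.
$cells = [' ', ' +0,25 pist.wit', ' +0,25 pist.br'];
foreach ($cells as $cell) {
    $parsed = parsePriceCell($cell);
    if ($parsed !== null) {
        echo $parsed['price'], ' ', $parsed['name'], PHP_EOL;
    }
}
```

The price is kept as a string with a comma, which matches the output the question asks for. If a number is needed instead, it can be converted with floatval(str_replace(',', '.', $parsed['price'])), as in the answer's code.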

Answer 1: (score: 0)

http://simplehtmldom.sourceforge.net/

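Simple HTML DOM is a library for fetching HTML elements by tag. It works well for reading an element's value or class name, depending on what you need.

Example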

// Load the library; simple_html_dom.php is the file shipped with Simple HTML DOM
include 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all images
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}

// Find all links
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
}