Remove whitespace from scraped text

Time: 2015-10-21 21:41:16

Tags: php web web-scraping fopen

$url = 'MyUrl';

$contents = file_get_contents($url); 

function scrape_between($data, $start, $end){
    $data = stristr($data, $start); 
    $data = substr($data, strlen($start));
    $stop = stripos($data, $end);
    $data = substr($data, 0, $stop);
    return $data;
}

$svetaines_turinys = trim(scrape_between($contents, "<table border=\"0\" cellspacing=\"0\">", "</table>"));

$fp = fopen("autogidas.php", "w+"); 

fwrite ($fp, "$svetaines_turinys"); 

fclose ($fp); 

$fh = fopen("autogidas.php", 'r') or die("negalima atidaryti"); // message is Lithuanian for "cannot open"

while(! feof($fh)) {

    $visa_data1 = fgets($fh);

    $visa_data = trim($visa_data1);

    $pavadinimas = trim(scrape_between($visa_data, "<span class=\"ttitle2\">", "</span>"));
    $metai = trim(scrape_between($visa_data, "<span class=\"ttitle1\">", "</span>"));
    $kaina = trim(scrape_between($visa_data, "<span class=\"ttitle1\" style='float: left;'>", "<br /><span class=\"grey\">"));

    echo "$pavadinimas<br> $metai <br> $kaina . <br><br>";
}

fclose($fh);

The output works, but it contains a lot of extra whitespace. I tried using trim(), but that did not solve the problem.

2 answers:

Answer 0: (score: 0)

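You can do this with a regular expression. Something like this should work, since \s+ matches any run of spaces, tabs, or newlines and replaces it with a single space: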

$metai = preg_replace('/\s+/', ' ',scrape_between($visa_data, "<span class=\"ttitle1\">", "</span>"));

Just do the same for every variable that has the same problem.
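
To avoid repeating the preg_replace() call for each field, the cleanup could be wrapped in a small helper. This is only a sketch: the name normalize_whitespace is hypothetical and not part of the original code, and the sample string is made up for illustration.

```php
<?php
// Hypothetical helper (not from the original code): collapse every run of
// whitespace (spaces, tabs, newlines) into one space, then trim both ends.
function normalize_whitespace($text) {
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Made-up sample input with extra spaces, a newline, and a tab.
echo normalize_whitespace("  Audi   A4 \n\t 2008  "); // prints "Audi A4 2008"
```

It could then wrap each scrape_between() call, for example normalize_whitespace(scrape_between($visa_data, "<span class=\"ttitle1\">", "</span>")).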

Answer 1: (score: 0)

If you mean that you want to collapse multiple spaces into a single space, you can use str_replace() like this:

function scrape_between($data, $start, $end){
    $data = stristr($data, $start); 
    $data = substr($data, strlen($start));
    $stop = stripos($data, $end);
    $data = substr($data, 0, $stop);
    return str_replace('  ', ' ', $data);
}
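
One caveat: a single str_replace('  ', ' ', ...) call makes only one pass, so a run of four spaces becomes two rather than one, and tabs and newlines are left untouched. A minimal sketch of a variant that repeats the replacement until no double space remains (collapse_spaces is a hypothetical name, not from the original code):

```php
<?php
// Hypothetical helper: keep replacing two consecutive spaces with one
// until the string no longer contains any double space.
function collapse_spaces($text) {
    while (strpos($text, '  ') !== false) {
        $text = str_replace('  ', ' ', $text);
    }
    return $text;
}

echo collapse_spaces("a    b"); // prints "a b"
```

This handles only the space character. If the scraped markup also contains tabs or line breaks, the regular expression approach from the other answer is probably the simpler choice.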