Filtering useful data from an input file

Time: 2015-02-20 10:29:23

Tags: php

I have a fairly large and very messy data file from which I want to filter out the useful data. Its structure looks like this:

!bla bla
more bla
some useless data
something interesting
 something interesting
 something interesting
some useless data
something interesting
 something interesting
some useless data
bla bla

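My plan is to read the file with file_get_contents(), then use str_replace() to replace some of the data and use the replacements as markers. Next, I want to strip the useless data from the start of the file up to marker1, then from marker2 to marker3, then from marker4 to the end of the file, so that only the useful data remains in the output (at this point I am not sure whether I need the markers to stay in the data). I tried using strstr() but could not get it to work.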

    !bla bla
    more bla
    some useless data
    ==marker1==
    something interesting
     something interesting
     something interesting
    ==marker2==
    some useless data
    ==marker3==
    something interesting
     something interesting
    ==marker4==
    some useless data
    bla bla

I will then use explode() to transfer the resulting useful data into my database.
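
The explode() step might look roughly like this. It is a sketch under assumptions: the variable names are made up, the sample string stands in for data already extracted from the file, and the actual database insert is left out because the question does not say which database or schema is used.

```php
<?php
// Sample of already-extracted data; the real input would come from the file.
$useful = "something interesting\n something interesting\n something interesting";

// Split on newlines, trim each line (this also removes a stray "\r" from
// Windows line endings), and drop lines that end up empty.
$rows = array_values(array_filter(array_map('trim', explode("\n", $useful)), 'strlen'));

print_r($rows);
```

Each element of `$rows` could then be passed to whatever insert routine the database layer provides.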

Edit: OK, I solved it like this.

preg_match('/(==marker1==)(.*?)(==marker2==)/s', $input, $marker1to2);
$marker1to2 = trim($marker1to2[2]); 
$marker1to2 = preg_replace('/something /', '==marker1== something ', $marker1to2, 1); 
echo $marker1to2;

1 answer:

Answer 0: (score: 0)

You need a regular expression:

$data = "!bla bla
more bla
some useless data
==marker1==
something interesting
 something interesting
 something interesting
==marker2==
some useless data
==marker3==
something interesting
 something interesting
==marker4==
some useless data
bla bla";

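// The /s modifier lets "." match newlines; the lazy ".*?" stops at the
// first closing marker, even if that marker appears again later in the data.
preg_match("/(==marker1==)(.*?)(==marker2==)/s", $data, $marker1to2);
$marker1to2 = trim($marker1to2[2]);

preg_match("/(==marker3==)(.*?)(==marker4==)/s", $data, $marker3to4);
$marker3to4 = trim($marker3to4[2]);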

echo "Marker 1 to 2:\n$marker1to2\n\n";
echo "Marker 3 to 4:\n$marker3to4\n\n";

Output:

Marker 1 to 2:
something interesting
 something interesting
 something interesting

Marker 3 to 4:
something interesting
 something interesting
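
If the file contains many marker pairs, one preg_match() call per pair gets repetitive. A possible generalization, sketched here under the assumption that every marker has the form `==markerN==` and that markers always come in open/close pairs, is to collect all blocks at once with preg_match_all(). The shortened sample data and the variable names are illustrative only.

```php
<?php
// Shortened sample data, assumed for illustration.
$data = "junk\n==marker1==\nA1\n A2\n==marker2==\njunk\n==marker3==\nB1\n==marker4==\njunk";

// Lazy (.*?) stops at the nearest closing marker; /s lets "." match newlines.
preg_match_all('/==marker\d+==(.*?)==marker\d+==/s', $data, $matches);

// $matches[1] holds the captured text of every block; trim each one.
$blocks = array_map('trim', $matches[1]);
print_r($blocks);
```

Because matching resumes after each closing marker, the useless data between marker2 and marker3 is skipped rather than captured.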