I'm writing a parser of sorts for text data. I can't change the source text, which looks like this:
27 may 15:28 Id: 42 #1 Random Text
Info: 3 Locatin: Street Guests: 2
(Text header 1) Apple 15
(Text header 2) Milk 2
(Text header 1) Ice cream 4
(Text header 3) Pencil 1
(Text header 1) Box 1
(Text header 2) Cardboard x1
(Text header 3) White x1
(Text header 1) Cube x1
(Text header 1) Phone 1
(Text header 1) Specific text x1
(Text header 1) Symbian x1
From that, I'm trying to produce a CSV file that looks like this:
42 ; 15:28
Apple ; 15 ; NOHANDLE ; NOHANDLE
Milk ; 2 ; NOHANDLE ; NOHANDLE
Ice cream ; 4 ; NOHANDLE ; NOHANDLE
Pencil ; 1 ; NOHANDLE ; NOHANDLE
Box ; 1 ; Cardboard, White, Cube ; NOHANDLE
Phone ; 1 ; Symbian ; Specific text
I don't know how to approach this... I'm just stuck...
<?php
// Set the path to the input file
$file_input = fopen('C:\work\input_utf8_0.txt', 'r') or die("can't open file");
// Child lines start with 3 spaces
$childLine = '   ';
while (($line = fgets($file_input)) !== false) {
    // if ...
    // I have no idea...
    echo $line;
    // Yeah, it's just the lines as they are... :-(
}
fclose($file_input);
?>
I'm also trying a regular expression:
<?php
$re = '/^(?<child> )?(?>\(.*\) )(?<text>\w+(?> \w+)*)\ +x?(?<count>[0-9]+)$/m';
$str = '(Text header 1) Apple 15
(Text header 2) Milk 2
(Text header 1) Ice cream 4
(Text header 3) Pencil 1
(Text header 1) Box 1
(Text header 2) Cardboard x1
(Text header 3) White x1
(Text header 1) Cube x1
(Text header 1) Phone 1
(Text header 1) Specific text x1
(Text header 1) Symbian x1';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
?>
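On regex101 (https://regex101.com/r/7UAWAV/2) the regular expression looks like a good fit: it finds all the matches. In practice, though, the code above outputs only one array, the one for the last line.
I think I need some kind of loop that matches each line and then merges the arrays, or something along those lines.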
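The loop idea can be sketched as follows. This is only a rough outline under some assumptions the post does not confirm: child lines are indented with whitespace while parent lines are not, all children are collected into the third column, and the fourth column is always NOHANDLE because the rule that puts "Specific text" there is not clear from the sample. The helper names parseItems and formatRows are made up for illustration. Trimming "\r\n" from each line may also be relevant to the "only the last line matches" symptom: if the real input has Windows line endings, the "\r" left in front of each "\n" would keep `$` from matching on every line except the last.

```php
<?php
// Sketch only. Assumptions (not confirmed by the post):
//  - child lines are indented with whitespace, parent lines are not;
//  - all children go into the third column; the rule for the fourth
//    column is unclear, so it is always NOHANDLE here.

// Hypothetical helper: turn item lines into rows with their children.
function parseItems(array $lines): array
{
    // Optional indent, "(header) ", item text, spaces, optional "x", count.
    $re = '/^(?<child>\s+)?\(.*?\)\s+(?<text>.+?)\s+x?(?<count>\d+)$/';
    $rows = [];
    foreach ($lines as $line) {
        // Strip "\r\n" or "\n" so that $ can match at the end of every line.
        $line = rtrim($line, "\r\n");
        if (!preg_match($re, $line, $m)) {
            continue; // not an item line (for example the two header lines)
        }
        if ($m['child'] !== '' && $rows) {
            // Child line: attach it to the most recent parent row.
            $rows[count($rows) - 1]['children'][] = $m['text'];
        } else {
            $rows[] = ['text' => $m['text'], 'count' => $m['count'], 'children' => []];
        }
    }
    return $rows;
}

// Hypothetical helper: format rows as " ; "-separated lines.
function formatRows(array $rows): array
{
    $out = [];
    foreach ($rows as $r) {
        $children = $r['children'] ? implode(', ', $r['children']) : 'NOHANDLE';
        $out[] = implode(' ; ', [$r['text'], $r['count'], $children, 'NOHANDLE']);
    }
    return $out;
}

// Shortened sample; the 3-space indent on child lines is an assumption.
$sample = [
    "27 may 15:28 Id: 42 #1 Random Text\r\n",
    "(Text header 1) Apple 15\r\n",
    "(Text header 1) Box 1\r\n",
    "   (Text header 2) Cardboard x1\r\n",
    "   (Text header 3) White x1\r\n",
    "(Text header 1) Phone 1\r\n",
];

// Header line: pull out the time and the Id.
if (preg_match('/(?<time>\d{1,2}:\d{2})\s+Id:\s*(?<id>\d+)/', $sample[0], $h)) {
    echo $h['id'] . ' ; ' . $h['time'] . "\n";
}
echo implode("\n", formatRows(parseItems($sample))) . "\n";
```

For the real file, the lines could come from `file($path)` or from the `fgets` loop shown earlier instead of the inline array. Matching one line at a time with `preg_match` also avoids the need for the `m` modifier and for merging the result arrays afterwards.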