Parsing a text file containing blog backup data in PHP

Date: 2016-07-20 11:03:59

Tags: php arrays parsing text file-handling

I have a .txt file containing blog backup data. The data is in the following format:

AUTHOR: A1
TITLE: title1…
STATUS: Publish
ALLOW COMMENTS: 1
CONVERT BREAKS: default
ALLOW PINGS: 0
PRIMARY CATEGORY: sample
CATEGORY: sample 2

DATE: 11/18/2010 09:36:00
-----
BODY:
Lorem Ipsum is simply dummy text of the printing and typesetti industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
-----
EXTENDED BODY:

-----
EXCERPT:

-----
KEYWORDS:
Key1, Key2, key3
-----

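I want to convert the content above into an array of key/value pairs. I am having trouble reading multi-line values such as the BODY field, and values that sit on the line after their key (such as KEYWORDS).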

$file_handle = fopen("show.txt", "r");
while (!feof($file_handle)) {
    $line_of_text = fgets($file_handle);
    $parts = array_map('trim', explode(':', $line_of_text, 2));

    // store the data in the array
    $result[$parts[0]] = isset($parts[1]) ? $parts[1] : ""; 
}

1 answer:

Answer 0 (score: 0)

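You may find this useful. It makes sense to prepare the data first, by removing junk characters, line breaks and so on, before trying to convert it into an array.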

/* modify to suit path to your file */
$file='c:/temp2/blogdata.txt';
if( realpath($file) ){

    /* prepare the text: join each "KEY:" line with the value on the next line,
       then strip hyphens (this also removes hyphens inside values);
       \R matches any line-ending style, not just \r\n */
    $text=str_replace('-','',preg_replace('@:\R@',':',file_get_contents($file,FILE_TEXT)));

    /* create an array from the various lines */
    $a=explode( '|', preg_replace('@\R@', '|', $text));

    /* something to store the results in */
    $output=array();

    /* iterate through the array of data and add param/value to output */
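    foreach($a as $pair){
        /* pad to two elements so a line without a colon does not raise an undefined-offset notice */
        list($param,$value)=array_pad(explode(':',$pair),2,'');
        if(!empty($param) && !empty($value))$output[$param]=$value;
    }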

    /* debug output */
    echo '<pre>',print_r($output,true),'</pre>';    
}

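This produces the following output. Note that the DATE value is cut off at the first colon of the time, because explode(':', $pair) is called without a limit, and that empty fields such as EXTENDED BODY and EXCERPT are skipped: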

Array
(
    [AUTHOR] =>  A1
    [TITLE] =>  title1…
    [STATUS] =>  Publish
    [ALLOW COMMENTS] =>  1
    [CONVERT BREAKS] =>  default
    [ALLOW PINGS] =>  0
    [PRIMARY CATEGORY] =>  sample
    [CATEGORY] =>  sample 2
    [DATE] =>  11/18/2010 09
    [BODY] => Lorem Ipsum is simply dummy text of the printing and typesetti industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
    [KEYWORDS] => Key1, Key2, key3
)
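
If the truncated DATE or the stripped hyphens are a problem, an alternative is to walk the file line by line and track whether you are inside a multi-line field. The following is only a minimal sketch under some assumptions that go beyond the question: the function name parseBlogEntry is made up for illustration, the header block is assumed to end at the first line consisting of five hyphens, a multi-line field is assumed to start with a bare "KEY:" line and end at the next five-hyphen line, and PHP 7 or later is assumed for the type declarations. A body that itself contains a line of exactly five hyphens would be cut short.

```php
<?php
/**
 * Parse one blog-export entry into an associative array.
 * Hypothetical helper: the name and the format rules are assumptions
 * based on the sample data in the question.
 */
function parseBlogEntry(string $text): array
{
    $result     = [];
    $inHeader   = true;  // true until the first "-----" separator is seen
    $currentKey = null;  // name of the multi-line field being collected
    $buffer     = [];    // lines collected for that field

    // \R splits on any line-ending style (\n, \r\n or \r).
    foreach (preg_split('/\R/', $text) as $line) {
        if (trim($line) === '-----') {
            // Separator: close the open multi-line field, if there is one.
            if ($currentKey !== null) {
                $result[$currentKey] = trim(implode("\n", $buffer));
                $currentKey = null;
                $buffer = [];
            }
            $inHeader = false;
            continue;
        }

        if ($currentKey !== null) {
            $buffer[] = $line; // inside a multi-line field: keep the line as-is
            continue;
        }

        if (trim($line) === '') {
            continue; // blank line between fields
        }

        // A limit of 2 keeps colons inside the value (e.g. the time in DATE).
        $parts = explode(':', $line, 2);
        if (count($parts) < 2) {
            continue; // not a "KEY: value" line
        }
        $key   = trim($parts[0]);
        $value = trim($parts[1]);

        if ($inHeader || $value !== '') {
            $result[$key] = $value;  // single-line field
        } else {
            $currentKey = $key;      // bare "KEY:" opens a multi-line field
        }
    }

    // Flush a field that was never closed by a separator.
    if ($currentKey !== null) {
        $result[$currentKey] = trim(implode("\n", $buffer));
    }

    return $result;
}

// Usage with a shortened version of the sample data; for the real file,
// pass file_get_contents('show.txt') instead.
$sample = "AUTHOR: A1\nDATE: 11/18/2010 09:36:00\n-----\n"
        . "BODY:\nFirst line\nSecond line\n-----\n"
        . "KEYWORDS:\nKey1, Key2, key3\n-----\n";
print_r(parseBlogEntry($sample));
```

Unlike the version above, this keeps empty fields in the result with an empty string as the value, and it trims the leading space from each value.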