How do I extract information from a given text into an array?

Time: 2015-08-03 15:26:06

Tags: php arrays extract

Consider text of the form:

@article{refregier2003weak,   
title={Weak gravitational lensing by large-scale structure},   
author={Refregier, Alexandre},  
journal={Annual Review of Astronomy and Astrophysics},  
volume={41}, 
pages={645},   
year={2003},   
publisher={Annual Reviews, Inc.} 
}

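where "refregier2003weak" is the ID of the article. The order of the tags (title, author, journal, ...) may vary from one article to another, and in some cases certain tags may be missing from an article.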

How can I use PHP to extract the values of these tags and the article ID into an array?

3 answers:

Answer 0 (score: 0)

This can be done with simple substring methods. For example:

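$parsed = array();
foreach (explode("\n", $text) as $line) { // Split by line
    $line = trim($line);
    if (strpos($line, '{') === false) {
        continue; // Skip lines without a {, such as the closing }
    }
    list($key, $value) = explode('{', $line, 2); // Split at the first { ($key is the text to the left, $value is the text to the right)
    $end = strrpos($value, '}');
    // Field values end at the last }; the @article line has no }, so strip its trailing comma instead
    $value = ($end === false) ? rtrim($value, ', ') : substr($value, 0, $end);
    switch ($key) {
        case '@article': $parsed['articleId'] = $value; break;
        case 'title=': $parsed['title'] = $value; break; // The { was consumed by explode, so the key is "title="
        case 'author=': // ...
    }
}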

var_dump($parsed);
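
The switch above needs one case per field. As a rough sketch of the same substring idea without the switch, the loop can store whatever key appears on each line. The function name `parseEntryLines` and the `articleId` key are illustrative choices, and the code assumes one field per line with the closing brace on its own line, as in the question's example:

```php
<?php
// Hypothetical helper: collects every "key={value}" line into an array.
// Assumes one field per line and the entry's closing } on its own line.
function parseEntryLines(string $text): array
{
    $parsed = [];
    foreach (explode("\n", $text) as $line) {
        $line = trim($line);
        $open = strpos($line, '{');
        if ($open === false) {
            continue; // No { on this line, e.g. the closing }
        }
        $key = substr($line, 0, $open);   // Text left of the first {
        $rest = substr($line, $open + 1); // Text right of the first {
        if ($key === '@article') {
            $parsed['articleId'] = rtrim($rest, ', '); // The ID ends with a comma, not a }
            continue;
        }
        $close = strrpos($rest, '}');
        if ($close !== false) {
            // Drop the trailing "=" from the key and everything from the last } onward
            $parsed[rtrim($key, '= ')] = substr($rest, 0, $close);
        }
    }
    return $parsed;
}

$text = "@article{refregier2003weak,\ntitle={Weak gravitational lensing by large-scale structure},\nyear={2003}\n}";
var_dump(parseEntryLines($text));
```

Because keys are taken from the text itself, field order does not matter and missing fields are simply absent from the result.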

Answer 1 (score: 0)

Here is a quick solution:

<?php

$input = '@article{refregier2003weak,   
title={Weak gravitational lensing by large-scale structure},   
author={Refregier, Alexandre},  
journal={Annual Review of Astronomy and Astrophysics},  
volume={41}, 
pages={645},   
year={2003},   
publisher={Annual Reviews, Inc.} 
}';
$pattern = '@(\w+)=\{(.*)\}@';
$articleIdPattern = '|@article\{(.*?),|';
preg_match_all($pattern, $input, $matches);
preg_match_all($articleIdPattern, $input, $articleMatches);

$result = [];
if (isset($matches[1]) && isset($matches[2])) {
    foreach ($matches[1] as $key => $value) {
        if (isset ($matches[2][$key])) {
            $result[$value] = $matches[2][$key];
        }
    }
}
if (isset($articleMatches[1][0])) {
    $result['article'] = $articleMatches[1][0];
}

var_dump($result);

http://ideone.com/K0WPVg
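
Since the question notes that some tags may be missing, one possible variation is to pre-fill the result with defaults so that every expected key exists. This is only a sketch: the function name `parseEntryWithDefaults` and the list of expected fields are assumptions taken from the example entry, not part of the answer above.

```php
<?php
// Hypothetical helper: parses one entry and fills absent fields with null.
// The list of expected fields is an assumption based on the question's example.
function parseEntryWithDefaults(string $input): array
{
    $fields = ['title', 'author', 'journal', 'volume', 'pages', 'year', 'publisher'];
    $result = array_fill_keys($fields, null);
    $result['article'] = null;

    // The ID runs from "@article{" up to the first comma or whitespace
    if (preg_match('/@article\{\s*([^,\s]+)/', $input, $m)) {
        $result['article'] = $m[1];
    }
    // PREG_SET_ORDER yields one [full match, key, value] triple per field
    preg_match_all('/(\w+)\s*=\s*\{(.*)\}/', $input, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $result[$match[1]] = $match[2];
    }
    return $result;
}

$entry = "@article{refregier2003weak,\nyear={2003},\ntitle={Weak gravitational lensing by large-scale structure}\n}";
$parsed = parseEntryWithDefaults($entry);
var_dump($parsed['title'], $parsed['year'], $parsed['author']);
```

Callers can then test a field with `=== null` instead of `isset`, at the cost of having to maintain the field list.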

Answer 2 (score: 0)

This works as long as you only parse one entry at a time.

Remember to replace $inputString with the name of your own input string variable.

<?php  
preg_match_all('/@article\{(.*?),/', $inputString, $id); // Capture the ID between "@article{" and the first comma
preg_match_all('/(\w+)=\{(.*?)\}/', $inputString, $meta); // Capture each key={value} pair
$arrayResults = array("id" => $id[1][0]);
foreach ($meta[1] as $key => $someMeta) {
    $arrayResults[$someMeta] = $meta[2][$key];
}
print_r($arrayResults);
?>

This produces:

Array
(
    [id] => refregier2003weak
    [title] => Weak gravitational lensing by large-scale structure
    [author] => Refregier, Alexandre
    [journal] => Annual Review of Astronomy and Astrophysics
    [volume] => 41
    [pages] => 645
    [year] => 2003
    [publisher] => Annual Reviews, Inc.
)
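
If the input holds several entries rather than one, a possible extension is to split the string at each "@article{" first and then apply the same kind of patterns to each piece. The sketch below is an assumption-laden illustration, not part of the answer: `parseEntries` is a made-up name, every entry is assumed to start with "@article{", and each field is assumed to sit on its own line.

```php
<?php
// Hypothetical helper: parses a string holding several @article entries.
// Assumes every entry starts with "@article{" and has one field per line.
function parseEntries(string $input): array
{
    $entries = [];
    // The lookahead splits just before each "@article{" without consuming it
    $chunks = preg_split('/(?=@article\{)/', $input, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($chunks as $chunk) {
        if (!preg_match('/@article\{\s*([^,\s]+)/', $chunk, $id)) {
            continue; // Skip text that is not an entry
        }
        $fields = [];
        // The m flag lets ^ match at the start of every line
        preg_match_all('/^\s*(\w+)\s*=\s*\{(.*)\}/m', $chunk, $meta, PREG_SET_ORDER);
        foreach ($meta as $m) {
            $fields[$m[1]] = $m[2];
        }
        $entries[$id[1]] = $fields; // Index the result by article ID
    }
    return $entries;
}

$input = "@article{a2001,\ntitle={First},\nyear={2001}\n}\n@article{b2002,\ntitle={Second}\n}";
print_r(parseEntries($input));
```

Indexing by article ID means a later entry with a duplicate ID overwrites the earlier one; appending to a plain list instead would keep both.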