I receive a string from the Wikipedia API that looks like this:
{{Wikibooks|Wikijunior:Countries A-Z|France}} {{Sister project links|France}} * [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] * [http://ucblibraries.colorado.edu/govpubs/for/france.htm France] at ''UCB Libraries GovPubs'' *{{dmoz|Regional/Europe/France}} * [http://www.britannica.com/EBchecked/topic/215768/France France] ''Encyclopædia Britannica'' entry * [http://europa.eu/about-eu/countries/member-countries/france/index_en.htm France] at the [[European Union|EU]] *{{Wikiatlas|France}} *{{osmrelation-inline|1403916}} * [http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=FR Key Development Forecasts for France] from [[International Futures]] ;Economy *{{INSEE|National Institute of Statistics and Economic Studies}} * [http://stats.oecd.org/Index.aspx?QueryId=14594 OECD France statistics]
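I need both the actual URLs and the descriptions of the URLs. For example, for [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] I need "http://www.bbc.co.uk/news/world-europe-17298730" and also "France] from the [[BBC News]]", but without the brackets, like this: France from the BBC News.
I managed to get the first part by doing the following: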
if (preg_match_all('/\[http(.*?)\s/', $result, $extmatch)) {
    $mt = str_replace("[[", "", $extmatch[1]);
}
But I don't know how to get the second part (unfortunately, I'm very weak at regular expressions :-().
Any ideas?
Answer 0 (score: 1)
PHP:
$input = "{{Wikibooks|Wikijunior:Countries A-Z|France}} {{Sister project links|France}} * [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] * [http://ucblibraries.colorado.edu/govpubs/for/france.htm France] at ''UCB Libraries GovPubs'' *{{dmoz|Regional/Europe/France}} * [http://www.britannica.com/EBchecked/topic/215768/France France] ''Encyclopædia Britannica'' entry * [http://europa.eu/about-eu/countries/member-countries/france/index_en.htm France] at the [[European Union|EU]] *{{Wikiatlas|France}} *{{osmrelation-inline|1403916}} * [http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=FR Key Development Forecasts for France] from [[International Futures]] ;Economy *{{INSEE|National Institute of Statistics and Economic Studies}} * [http://stats.oecd.org/Index.aspx?QueryId=14594 OECD France statistics]";
$regex = '/\[(http\S+)\s+([^\]]+)\](?:\s+from(?:\s+the)?\s+\[\[(.*?)\]\])?/';
preg_match_all($regex, $input, $matches, PREG_SET_ORDER);
var_dump($matches);
Output:
array(6) {
[0]=>
array(4) {
[0]=>
string(78) "[http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]]"
[1]=>
string(47) "http://www.bbc.co.uk/news/world-europe-17298730"
[2]=>
string(6) "France"
[3]=>
string(8) "BBC News"
}
...
...
...
...
...
}
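To get from that match array to URL/description pairs, something like the following sketch may help. The function name extractLinks and the array keys url, label and source are hypothetical names chosen for illustration. Note that group 3 is simply missing from a match when no "from [[...]]" suffix follows the link, so it is read with a null fallback.

```php
<?php
// Hypothetical helper: turns the PREG_SET_ORDER matches into url/label/source rows.
function extractLinks(string $wikitext): array
{
    $regex = '/\[(http\S+)\s+([^\]]+)\](?:\s+from(?:\s+the)?\s+\[\[(.*?)\]\])?/';
    preg_match_all($regex, $wikitext, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        $links[] = [
            'url'    => $m[1],
            'label'  => $m[2],
            // Group 3 is not set when the optional "from [[...]]" part did not match.
            'source' => $m[3] ?? null,
        ];
    }
    return $links;
}

// Shortened sample input taken from the question's string.
$sample = "* [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] "
        . "* [http://stats.oecd.org/Index.aspx?QueryId=14594 OECD France statistics]";
$links = extractLinks($sample);
print_r($links);
```

The label and source can then be joined however the description should read, for example "France from the BBC News".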
Explanation:
\[ (?# match [ literally)
( (?# start capture group)
http (?# match http literally)
\S+ (?# match 1+ non-whitespace characters)
) (?# end capture group)
\s+ (?# match 1+ whitespace characters)
( (?# start capture group)
[^\]]+ (?# match 1+ non-] characters)
) (?# end capture group)
\] (?# match ] literally)
(?: (?# start non-capturing group)
\s+ (?# match 1+ whitespace characters)
from (?# match from literally)
(?: (?# start non-capturing group)
\s+ (?# match 1+ whitespace characters)
the (?# match the literally)
)? (?# end optional non-capturing group)
\s+ (?# match 1+ whitespace characters)
\[\[ (?# match [[ literally)
( (?# start capturing group)
.*? (?# lazily match 0+ characters)
) (?# end capturing group)
\]\] (?# match ]] literally)
)? (?# end optional non-capturing group)
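If you need a more thorough explanation, let me know, but my comments above should help. If you have any specific questions, I'm more than happy to help. A regex visualization tool can also help you see what the expression is doing.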
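The same pattern can also be written with PCRE's x (extended) modifier, which ignores unescaped whitespace in the pattern and treats # as the start of a comment, so the annotations can live inside the regex itself. This is a minimal sketch; the variable names $verboseRegex, $compactRegex and $sample are only for illustration, and the comments deliberately avoid the / delimiter and single quotes.

```php
<?php
// Annotated version of the pattern using the x (extended) modifier.
$verboseRegex = '/
    \[ (http\S+)          # group 1: the URL, up to the first whitespace
    \s+ ([^\]]+) \]       # group 2: the link text, up to the closing ]
    (?:                   # optional suffix: from [the] [[Source]]
        \s+ from (?:\s+ the)? \s+
        \[\[ (.*?) \]\]   # group 3: the text between [[ and ]]
    )?
/x';

// The original one-line pattern, for comparison.
$compactRegex = '/\[(http\S+)\s+([^\]]+)\](?:\s+from(?:\s+the)?\s+\[\[(.*?)\]\])?/';

$sample = "* [http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=FR "
        . "Key Development Forecasts for France] from [[International Futures]]";

preg_match_all($verboseRegex, $sample, $verbose, PREG_SET_ORDER);
preg_match_all($compactRegex, $sample, $compact, PREG_SET_ORDER);

// Both patterns should yield identical match arrays.
var_dump($verbose === $compact);
```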
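Answer 1 (score: 1)
A solution that avoids regular expressions for the parsing itself (a single preg_replace call is only used to strip the brackets):
Code: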
// $str holds the wikitext string from the question
$parts=explode('*',$str);
$links=array();
foreach($parts as $k=>$v){
$parts[$k]=ltrim($v);
if(substr($parts[$k],0,1)!=='['){
unset($parts[$k]);
continue;
}
$parts[$k]=preg_replace('/\[|\]/','',$parts[$k]);
$subparts=explode(' ',$parts[$k]);
$links[$k][0]=$subparts[0];
unset($subparts[0]);
$links[$k][1]=implode(' ',$subparts);
}
echo '<pre>'.print_r($links,true).'</pre>';
Result:
Array
(
[1] => Array
(
[0] => http://www.bbc.co.uk/news/world-europe-17298730
[1] => France from the BBC News
)
[2] => Array
(
[0] => http://ucblibraries.colorado.edu/govpubs/for/france.htm
[1] => France at ''UCB Libraries GovPubs''
)
[4] => Array
(
[0] => http://www.britannica.com/EBchecked/topic/215768/France
[1] => France ''Encyclopædia Britannica'' entry
)
[5] => Array
(
[0] => http://europa.eu/about-eu/countries/member-countries/france/index_en.htm
[1] => France at the European Union|EU
)
[8] => Array
(
[0] => http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=FR
[1] => Key Development Forecasts for France from International Futures ;Economy
)
[10] => Array
(
[0] => http://stats.oecd.org/Index.aspx?QueryId=14594
[1] => OECD France statistics
)
)
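The descriptions in this result still carry some wikitext: the piped link comes out as "European Union|EU", the '' italic quotes remain, and ";Economy" from the next section is appended to one entry. A possible cleanup, sketched below, would be applied to the raw description before the brackets are stripped. The function name cleanDescription is hypothetical, and cutting everything after a ";" is an assumption that only holds if descriptions never contain a semicolon themselves.

```php
<?php
// Hypothetical helper: strips common wikitext markup from a link description.
function cleanDescription(string $text): string
{
    // [[Target|Label]] becomes Label, [[Target]] becomes Target.
    $text = preg_replace('/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/', '$1', $text);
    // Remove runs of two or more single quotes (italic and bold markers).
    $text = preg_replace("/'{2,}/", '', $text);
    // Assumption: a ";" starts a trailing section marker such as ";Economy".
    $text = preg_replace('/\s*;.*$/', '', $text);
    return trim($text);
}

$examples = [
    "France from the [[BBC News]]",
    "France at the [[European Union|EU]]",
    "France ''Encyclopædia Britannica'' entry",
    "Key Development Forecasts for France from [[International Futures]] ;Economy",
];
$cleaned = array_map('cleanDescription', $examples);
print_r($cleaned);
```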