From this string:
<div class="latestf"> <a href="http://www.x.ro/anamaria/"
rel="nofollow"
I want to extract anamaria. How can I do this with preg_match_all?
I tried:
preg_match_all("'<div class=\"latestf\">
<a href=\"http://www.x.ro/(.*?)\" rel=\"nofollow\"'si", $source, $match);
But it doesn't work...
Thanks in advance!
Answer 0 (score: 1)
Try this:
$source = '<div class="latestf"> <a href="http://www.x.ro/anamaria/" rel="nofollow"';
preg_match_all('#<div\s*class="latestf">\s*<a\s*href="http://www\.x\.ro/(.*?)/?"\s*rel="nofollow"#i', $source, $match);
print_r($match);
Output:
Array
(
[0] => Array
(
[0] => <div class="latestf"> <a href="http://www.x.ro/anamaria/" rel="nofollow"
)
[1] => Array
(
[0] => anamaria
)
)
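The likely reason the attempt in the question fails is whitespace: the pattern there contains literal spaces and line breaks, so it only matches input with exactly the same layout. The sketch below compares a pattern with literal spaces against one using \s, assuming the input has a line break between the href and rel attributes as shown in the question. The variable names ($strict, $loose and so on) are made up for this illustration.

```php
<?php
// Assumption: the input has a newline between href and rel, as in the question.
$source = "<div class=\"latestf\"> <a href=\"http://www.x.ro/anamaria/\"\nrel=\"nofollow\"";

// Literal single spaces: fails as soon as the input uses a newline instead.
$strict = '#<div class="latestf"> <a href="http://www\.x\.ro/(.*?)/?" rel="nofollow"#i';

// \s+ and \s* accept any run of whitespace, including newlines.
$loose = '#<div\s+class="latestf">\s*<a\s+href="http://www\.x\.ro/(.*?)/?"\s+rel="nofollow"#i';

$strictCount = preg_match_all($strict, $source, $strictMatch);
$looseCount  = preg_match_all($loose, $source, $looseMatch);

var_dump($strictCount);   // int(0)
var_dump($looseCount);    // int(1)
echo $looseMatch[1][0];   // anamaria
```

Index 1 of the match array holds the first capture group, so $looseMatch[1][0] is the captured name from the first match.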
Answer 1 (score: 1)
Don't try to parse HTML with regular expressions. Use a DOM parser instead:
$html = '<div class="latestf"> <a href="http://www.x.ro/anamaria/"
rel="nofollow"';
$dom = new DOMDocument;
@$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('a') as $node)
{
$link = $node->getAttribute("href");
}
$parsed = parse_url($link);
echo substr($parsed['path'], 1, -1);
Output:
anamaria
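A slightly more defensive variant of the same idea, as a rough sketch: it restricts the lookup to links inside the latestf div with DOMXPath, collects every match instead of only the last one, and uses trim() so the result does not depend on the trailing slash. The completed HTML fragment (link text and closing tags) and the $names variable are assumptions added for this example.

```php
<?php
// Assumption: a complete fragment; the question only shows the opening part.
$html = '<div class="latestf"> <a href="http://www.x.ro/anamaria/" rel="nofollow">link</a></div>';

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of using @
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$names = [];

// Only anchors that are direct children of <div class="latestf"> and have an href.
foreach ($xpath->query('//div[@class="latestf"]/a[@href]') as $node) {
    $path = parse_url($node->getAttribute('href'), PHP_URL_PATH);
    if (is_string($path)) {
        $names[] = trim($path, '/');   // "/anamaria/" becomes "anamaria"
    }
}

print_r($names);   // $names[0] is "anamaria"
```

If no matching link is found, $names simply stays empty, which avoids reading an undefined variable after the loop.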
Answer 2 (score: 0)
When / is used as the pattern delimiter, every literal / inside the pattern must be escaped as \/:
<?php
$source = '<div class="latestf"> <a href="http://www.x.ro/anamaria/" rel="nofollow"';
preg_match_all('/<div class="latestf"> <a href="http:\/\/www\.x\.ro\/(.*?)\/" rel="nofollow"/', $source, $match);
var_dump($match);exit;
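Escaping by hand is easy to get wrong, so one option is to let preg_quote() do it. The sketch below passes '/' as the second argument so that the delimiter is escaped along with the dots and other special characters in the fixed part of the pattern. The $prefix and $pattern names are hypothetical, and the input is assumed to be on a single line as in this answer.

```php
<?php
$source = '<div class="latestf"> <a href="http://www.x.ro/anamaria/" rel="nofollow"';

// preg_quote() escapes regex metacharacters; the second argument also escapes the delimiter.
$prefix  = preg_quote('<div class="latestf"> <a href="http://www.x.ro/', '/');

// Only the variable part is written as a raw regex.
$pattern = '/' . $prefix . '(.*?)\/" rel="nofollow"/';

$count = preg_match_all($pattern, $source, $match);

var_dump($count);     // int(1)
echo $match[1][0];    // anamaria
```

Choosing a delimiter that does not occur in the pattern, such as #, avoids the need to escape the slashes at all.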