How to parse the href value and its name from <span class="?">

Time: 2015-12-25 15:18:25

Tags: php dom html-parsing preg-match-all


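My HTML string ($str) contains links and their titles/names inside span elements. I want to extract view.php?id=123, view.php?id=124 and their names, galaxy and galaxy2.

Can anyone help me extract the links and the names inside them? I tried the following, but I get no data! Thanks in advance.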

$str='...............<span class="title" ><a href="view.php?id=123" class="title"><strong>galaxy</strong></a></span>............<span class="title" style=background:#000000><a href="watch.php?id=124" class="title"><strong>galaxy2</strong></a></span>';

if(preg_match_all('/\<span class="title" ><a href=(.*?)\<\/strong>/',$str,$match)) 
{             
    echo "<br>href:".$match[1][0];
    echo "<br>";
    echo "title:";
}

Sample data for $str:

<div class="profile cleaning" id="contentlist">
<div class="profile-item ">
<div class="img" data-preview="view.php?id=123">
<img src="./logos/123.jpg" width="240" height="140" alt="">
</div>
<span class="title" style=background:#000000><a href="view.php?id=123" class="title"><strong>Galaxy 1</strong></a></span>
</div><div class="profile-item ">
<div class="img" data-preview="view.php?id=124">
<img src="./logos/124.jpg" width="240" height="140" alt="">
</div>
<span class="title" style=background:#000000><a href="view.php?id=124" class="title"><strong>Galaxy 2</strong></a></span>
</div><div class="profile-item ">
<div class="img" data-preview="view.php?id=125">
<img src="./logos/125.png" width="240" height="140" alt="">
</div>
<span class="title" style=background:#000000><a href="view.php?id=125" class="title"><strong>Galaxy 3</strong></a></span>
</div><div class="profile-item " style="background:#000000;border:1px solid #326EE0;">
<div class="img" data-preview="view.php?id=126">

<div style="position: relative; left: 0; top: 0;vertical-align:top">
<img src="./logos/126.png" style="border: none;padding:1px;border:2px solid #326EE0;margin:0px;margin-bottom:2px;width:240px;position: relative; top: 0; left: 0; " >
<img src="images/mango.png" style="width:240px;position: absolute; top: 0px; left: 0px;"/>
</div>

</div>
<span class="title" ><a href="view.php?id=126" class="title"><strong>Galaxy 4</strong></a></span>
</div><div class="profile-item " style="background:#000000;border:1px solid #326EE0;">
<div class="img" data-preview="view.php?id=127">

<div style="position: relative; left: 0; top: 0;vertical-align:top">
<img src="./logos/127.jpg" style="border: none;padding:1px;border:2px solid #326EE0;margin:0px;margin-bottom:2px;width:240px;position: relative; top: 0; left: 0; " >
<img src="images/mango.png" style="width:240px;position: absolute; top: 0px; left: 0px;"/>
</div>

</div>
<span class="title" ><a href="view.php?id=127" class="title"><strong>Galaxy 5</strong></a></span>
</div><div class="profile-item " style="background:#000000;border:1px solid #326EE0;">
<div class="img" data-preview="view.php?id=128">

<div style="position: relative; left: 0; top: 0;vertical-align:top">
<img src="./logos/128.jpg" style="border: none;padding:1px;border:2px solid #326EE0;margin:0px;margin-bottom:2px;width:240px;position: relative; top: 0; left: 0; " >
<img src="images/mango.png" style="width:240px;position: absolute; top: 0px; left: 0px;"/>
</div>

</div>
<span class="title" ><a href="view.php?id=128" class="title"><strong>Galaxy 6</strong></a></span>
</div></div>

2 Answers:

Answer 0 (score: 1)

You can use a function.

$str='zxcvbnm<a href="http://www.example.com">zxcv</a>qwertyuiop<span class="title" ><a href="view.php?id=123" class="title"><strong>galaxy</strong></a></span>asdfghjkl<span class="title" style=background:#000000><a href="watch.php?id=124" class="title"><strong>galaxy2</strong></a></span>';

function parse_hrefANDname($str) {
    // Nothing to extract if the markup contains no title span at all.
    if (strpos($str, '<span class="title"') === false) return false;

    $res = array();

    $str_arr = explode('<a href=', $str);
    foreach ($str_arr as $k => $line) {
        if ($k == 0) continue;

        $href_quote = substr($line, 0, 1); // the value may be quoted as href="" or href=''
        $href_val = substr($line, 1);
        $href_val = substr($href_val, 0, strpos($href_val, $href_quote));

        $name = substr($line, strpos($line, '<strong>') + 8);
        $name = substr($name, 0, strpos($name, '</strong>'));

        $res[$k - 1]['href'] = $href_val;
        $res[$k - 1]['name'] = $name;
    }

    return $res;
}

$arr = parse_hrefANDname($str);
print_r($arr);
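
Since the question is tagged preg-match-all, the same extraction can also be sketched with a single regular expression. This is only a sketch against the sample string from the question: the names $pattern and $links are my own, and it assumes every anchor sits directly inside a span with class="title" and wraps its name in strong. Markup that deviates from that shape will not match.

```php
<?php
// Sample input shaped like the string in the question.
$str = '...<span class="title" ><a href="view.php?id=123" class="title"><strong>galaxy</strong></a></span>...'
     . '<span class="title" style=background:#000000><a href="watch.php?id=124" class="title"><strong>galaxy2</strong></a></span>';

// [^>]* skips any extra span attributes (such as the unquoted style).
// Group 1 is the quote character, group 2 the href value, group 3 the name.
$pattern = '~<span class="title"[^>]*>\s*<a href=(["\'])(.*?)\1[^>]*>\s*<strong>(.*?)</strong>~s';

$links = [];
if (preg_match_all($pattern, $str, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $links[] = ['href' => $m[2], 'name' => $m[3]];
    }
}

print_r($links);
```

PREG_SET_ORDER groups the captures per match, so each $m holds one link together with its name rather than separate parallel arrays.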

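Answer 1 (score: 0)

You can use SimpleXML for this. Child elements are accessed as properties and attributes with an array-like syntax, and it is better not to use regular expressions for this purpose: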

$str = '<container><span class="title" ><a href="view.php?id=123" class="title"><strong>galaxy</strong></a></span></container>';
$xml = simplexml_load_string($str);
echo $xml->span->a["href"]; // view.php?id=123
echo $xml->span->a->strong; // galaxy

So, for your case (with multiple spans):

<?php
$str='<container>
    <span class="title">
        <a href="view.php?id=123" class="title"><strong>galaxy</strong></a>
    </span>
    <span class="title" style="background:#000000">
        <a href="watch.php?id=124" class="title"><strong>galaxy2</strong></a>
    </span>
 </container>';
$xml = simplexml_load_string($str);
foreach ($xml->span as $span) {
    echo "Link: " . $span->a["href"] . "<br/>";
    echo "Content: " . $span->a->strong . "<br/>";
} 
?>

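Note: I added a container tag as the root element; in your case that might be html. I also had to correct the markup (adding double quotes), because SimpleXML only accepts well-formed XML.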
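
If correcting the markup by hand is not practical, PHP's DOMDocument HTML parser is more forgiving: it accepts unquoted attributes such as style=background:#000000 and finds the spans even when they are nested inside other elements. The following is a minimal sketch, assuming a shortened version of the sample data from the question; the variable names $html and $items are my own.

```php
<?php
// Shortened sample, keeping the unquoted style attribute from the question.
$html = '<div class="profile-item ">'
      . '<span class="title" style=background:#000000><a href="view.php?id=123" class="title"><strong>Galaxy 1</strong></a></span>'
      . '</div><div class="profile-item ">'
      . '<span class="title" ><a href="view.php?id=126" class="title"><strong>Galaxy 4</strong></a></span>'
      . '</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

// Select every anchor with an href that is a direct child of a title span.
$xpath = new DOMXPath($doc);
$items = [];
foreach ($xpath->query('//span[@class="title"]/a[@href]') as $a) {
    $items[] = [
        'href' => $a->getAttribute('href'),
        'name' => trim($a->textContent),
    ];
}

print_r($items);
```

The XPath expression matches on the exact class value "title", so a span carrying additional classes would need a different predicate.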