Get the content between tags, including inner tags

Time: 2014-05-25 07:55:16

Tags: php html

I have this content:

<html>
<body>
    <div class="another div">
        other content
    </div>
    <div class="fck_detail width_common">
        <p class="Normal">
            Some text 1.
        </p>
        <p class="Normal">
            Some text 2.
        </p>
        <div style="text-align:center;">
            <div class="embed-container">
                <div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
            </div>
        </div>
        <p class="Normal">
            Some text 3.
        </p>
        <p class="Normal">
            Some text 4.
        </p>
    </div>
</body>
</html>

I use the function below to get the content of the 'div class="fck_detail width_common"' element:

function get_content_by_tag($content, $tag_and_more, $include_tag = true){
        $p = stripos($content,$tag_and_more,0);
        if($p==false) return "";
        $content=substr($content,$p);
        $p = stripos($content," ",0);
        if(abs($p)==0) return "";
        $open_tag = substr($content,0,$p);
        $close_tag = substr($open_tag,0,1)."/".substr($open_tag,1).">";

        $count_inner_tag = 0;
        $p_open_inner_tag = 1; 
        $p_close_inner_tag = 0;
        $count=1;
        do{
            $p_open_inner_tag = stripos($content,$open_tag,$p_open_inner_tag);
            $p_close_inner_tag = stripos($content,$close_tag,$p_close_inner_tag);
            $count++;
            if($p_close_inner_tag!=false) $p = $p_close_inner_tag;
            if($p_open_inner_tag!=false){
                if(abs($p_open_inner_tag)<abs($p_close_inner_tag)){
                    $count_inner_tag++;
                    $p_open_inner_tag++;
                }else{
                    $count_inner_tag--;
                    $p_close_inner_tag++;
                }
            }else{
                $count_inner_tag--;
                if($p_close_inner_tag>0) $p_close_inner_tag++;
            }
        }while($count_inner_tag>0);
        if($include_tag)
            return substr($content,0,$p+strlen($close_tag));
        else{
            $content = substr($content,0,$p);
            $p = stripos($content,">",0);
            return substr($content,$p+1);
        }
    }

Then I tried:

echo get_content_by_tag($content, '<div class="fck_detail width_common">');

It only returns:

<div class="fck_detail width_common">
    <p class="Normal">
        Some text 1.
    </p>
    <p class="Normal">
        Some text 2.
    </p>
    <div style="text-align:center;">
        <div class="embed-container">
            <div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
        </div>
    </div>

The rest of the div is missing: "Some text 3", "Some text 4", and the div's own closing tag.

Can anyone tell me what is wrong?

2 answers:

Answer 0: (score: 1)

One way to do this is with the PHP Simple HTML DOM Parser:

$str = '
<html>
<body>
    <div class="another div">
        other content
    </div>
    <div class="fck_detail width_common">
        <p class="Normal">
            Some text 1.
        </p>
        <p class="Normal">
            Some text 2.
        </p>
        <div style="text-align:center;">
            <div class="embed-container">
                <div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
            </div>
        </div>
        <p class="Normal">
            Some text 3.
        </p>
        <p class="Normal">
            Some text 4.
        </p>
    </div>
</body>
</html> 
';

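// Assumes the library file (simple_html_dom.php) has already been included,
// so that str_get_html() is available.
$html = str_get_html($str);
// innertext is the element's inner HTML, nested tags included.
echo $html->find("div[class='fck_detail width_common']",0)->innertext;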
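
If adding a third-party library is not an option, the same lookup can probably be done with PHP's bundled DOM extension. The following is a minimal sketch, not part of the original answer: the helper name inner_html_by_class() is hypothetical, it assumes the DOM extension is enabled, and it assumes the class value contains no double quotes, since the value is pasted directly into the XPath expression.

```php
<?php
// Hypothetical helper (not from the original answer): returns the inner HTML
// of the first <div> whose class attribute equals $class_attr exactly.
function inner_html_by_class(string $html, string $class_attr): string
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them;
    // real-world HTML is rarely strictly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $class_attr contains no double quotes.
    $nodes = $xpath->query('//div[@class="' . $class_attr . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }

    // Serialize each child node so that nested tags are kept.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

$sample = '<html><body><div class="fck_detail width_common">'
        . '<p>Some text 1.</p><div><div></div></div><p>Some text 4.</p>'
        . '</div></body></html>';
echo inner_html_by_class($sample, 'fck_detail width_common');
```

Because the parser builds a tree, nested div elements need no manual counting. The serialized output may differ slightly from the input in whitespace or attribute quoting.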

Answer 1: (score: 0)

Try this library: http://sourceforge.net/projects/simplehtmldom/

You can get the data like this:

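// Load the page from a URL or file path ("www.yoururl.html" is a placeholder).
$url = "www.yoururl.html";
$html = file_get_html($url);   // already returns a parsed document object
$data = $html->find('.fck_detail', 0);

// Or, if the HTML is already in a string $str:
$html = str_get_html($str);
$data = $html->find("div[class='fck_detail width_common']", 0)->innertext;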
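
As for why the original get_content_by_tag() stops early: tracing it on the sample input suggests that $count_inner_tag starts at 0, so the outer opening tag is never counted. The loop therefore ends as soon as the three nested div elements are balanced, one closing tag too soon. The checks written as != false also treat a match at position 0 as "not found". The sketch below is one way to keep the string-based approach, not a definitive fix. The name extract_element() is hypothetical, and it assumes that no attribute value contains a ">" character. It starts counting at the outer tag and walks every opening and closing tag of that name in document order.

```php
<?php
// Hypothetical helper (not from the original post): returns the element that
// begins at $open_marker, tracking the nesting depth of tags with the same name.
function extract_element(string $html, string $open_marker, bool $include_tag = true): string
{
    $start = stripos($html, $open_marker);
    if ($start === false) {          // strict check: position 0 is a valid match
        return '';
    }
    // Tag name, e.g. "div" from '<div class="...">'.
    if (!preg_match('/^<([a-z][a-z0-9]*)/i', $open_marker, $m)) {
        return '';
    }
    // Matches any opening or closing tag of that name.
    // Assumption: attribute values never contain ">".
    $pattern = '/<(\/?)' . preg_quote($m[1], '/') . '\b[^>]*>/i';

    $depth = 0;
    $offset = $start;
    $inner_start = null;
    while (preg_match($pattern, $html, $tag, PREG_OFFSET_CAPTURE, $offset)) {
        [$text, $pos] = $tag[0];
        $offset = $pos + strlen($text);
        $depth += ($tag[1][0] === '/') ? -1 : 1;
        if ($inner_start === null) {
            $inner_start = $offset;  // just past the outer opening tag
        }
        if ($depth < 0) {
            return '';               // closing tag seen before any opening tag
        }
        if ($depth === 0) {
            return $include_tag
                ? substr($html, $start, $offset - $start)
                : substr($html, $inner_start, $pos - $inner_start);
        }
    }
    return '';                       // unbalanced markup: no matching close found
}

$sample = '<div class="a">x</div><div class="fck_detail width_common">'
        . '<p>1</p><div><div></div></div><p>4</p></div>';
echo extract_element($sample, '<div class="fck_detail width_common">', false);
// prints: <p>1</p><div><div></div></div><p>4</p>
```

This still treats HTML as plain text, so comments, scripts, or malformed markup can mislead it. A real parser is the more robust choice.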