Extracting a specific part of a large HTML block stored in a PHP variable

Time: 2011-11-08 16:19:52

Tags: php dom html

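I have the embed code for a slideshow, shown below. The whole HTML is stored in the variable $embed_code.

I print this code from PHP. Now I want to pull one piece out of this HTML string.

The code is below. I only want the part between the <object> tags.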

$embed_code = '
 <div style="width:425px" id="__ss_617490"><strong style="display:block;
 margin:12px 0 4px"><a href="http://www.slideshare.net/al.capone/funny-beer-babies-
 enginnering-rev-2-presentation" title="Funny beer babies enginnering rev. 
 2">Funny beer babies enginnering rev. 2</a></strong>


<object id="__sse617490" 
 width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com
/swf/ssplayer2.swf?doc=becoming-an-engineer-1222340701618958-9&stripped_title=funny-  
 beer-babies-enginnering-rev-2-presentation&userName=al.capone" /><param  
 name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/>
 <embed name="__sse617490" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=
  becoming-an-engineer-1222340701618958-9&stripped_title=funny-beer-babies-enginnering-
  rev-2-presentation& userName=al.capone" type="application/x-shockwave-flash" 
   allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed> 
  </object>




 <div style="padding:5px 0  12px">View more<a href="http://www.slideshare.net
  /"> presentations</a> from <a href="http://www.slideshare.net/al.capone">
  al.capone</a>.</div></div>';

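I only want the part of this string from <object id="....." through "</embed> </object>". The whole HTML is generated dynamically, so any ideas are welcome.

How can I do this? Is there a PHP function that extracts the HTML between a given pair of tags?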

3 answers:

Answer 0 (score: 1)

I like using phpQuery to parse HTML and extract data from it in PHP. It uses jQuery-style CSS selectors to traverse the markup.

So it would be:

require('phpQuery/phpQuery.php');
$doc = phpQuery::newDocumentHTML($embed_code);
$div = pq('div#__ss_617490'); // select a DIV with the specified ID
var_dump($div->attr('style')); //To get the style attribute
var_dump($div->html()); // To get the inner html

// now to get the object tag like you desire.
$object_tag = pq('object');

// only get the first object
$object_tag = pq('object:first');

Answer 1 (score: 1)

Use the DOMDocument class.

$dom = new DOMDocument();
$dom->loadHTML($embed_code);
$htmlObject = $dom->getElementById('__sse617490'); // Returns a DOMElement, or null if the ID is not found

http://www.php.net/dom
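
The getElementById call above returns a node rather than a string, and it relies on an ID that changes with each generated embed code. As a minimal sketch of one way to get the markup itself, the block below looks the element up by tag name and serializes it with saveHTML(). The shortened $embed_code and the variable names $object and $object_html are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical, shortened stand-in for the $embed_code from the question.
$embed_code = '<div id="__ss_617490"><strong>Title</strong>'
    . '<object id="__sse617490" width="425" height="355">'
    . '<param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf">'
    . '<embed name="__sse617490" src="http://static.slidesharecdn.com/swf/ssplayer2.swf"'
    . ' width="425" height="355"></embed>'
    . '</object><div>View more</div></div>';

// Collect parser warnings instead of printing them; embed code is rarely strictly valid HTML.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($embed_code);
libxml_clear_errors();

// Take the first <object> element, so the generated ID does not need to be known.
$object = $dom->getElementsByTagName('object')->item(0);

// Passing a node to saveHTML() (PHP 5.3.6+) serializes only that node, tags included.
$object_html = ($object !== null) ? $dom->saveHTML($object) : '';

echo $object_html, "\n";
```

The serialized markup may differ slightly from the input (attribute quoting, self-closing tags), because it is rebuilt from the parsed tree rather than copied from the string.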

Answer 2 (score: -1)

You can use a regular expression to find and extract it:

$embed_code = "blah blah <object ...>and other code here</object> blah blah";

$matches = array();
// [^>]* skips the tag's attributes; the s modifier lets . match newlines, and U makes .* lazy
preg_match('#<object\b[^>]*>(.*)</object>#isU', $embed_code, $matches);

// $matches[0] = "<object ...>and other code here</object>"
// $matches[1] = "and other code here"
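
The embed code in the question spans several lines and its <object> tag carries attributes, so a pattern for it has to allow both. The sketch below wraps one such pattern in a helper; the function name extract_object_tag and the sample string are hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper: returns the first <object>...</object> block, or null if none is found.
function extract_object_tag($html)
{
    // \b keeps the match to the <object> tag itself; [^>]* skips its attributes.
    // s lets . span newlines, i ignores case, and .*? stops at the first </object>.
    if (preg_match('#<object\b[^>]*>.*?</object>#is', $html, $matches)) {
        return $matches[0];
    }
    return null;
}

// Small multi-line sample standing in for the real embed code.
$embed_code = "<div>intro\n<object id=\"x\"\n width=\"425\"><param name=\"a\" value=\"b\"/>\n</object>\n<div>footer</div></div>";

var_dump(extract_object_tag($embed_code));
```

This is fine for predictable markup, but it will cut the match short if an attribute value contains ">" or if <object> tags are nested. A DOM parser handles those cases more reliably.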