Looping through several sets of HTML tags

Time: 2012-06-29 10:21:38

Tags: php curl

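I need some help here. What I want to do is pull out everything that belongs to the b nodes: each label inside a <B> tag and the text that follows it.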

<P><B>Credit Weighting: </B>5<BR><BR>
<B>Teaching Period(s): </B>Teaching Periods 1 and 2.<BR><BR>
<B>No. of Students: </B>-.<BR><BR>
<B>Pre-requisite(s): </B>None<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>

<P><B>Credit Weighting: </B>20<BR><BR>
<B>Teaching Period(s): </B>Teaching Periods 1 and 2.<BR><BR>
<B>No. of Students: </B>-.<BR><BR>
<B>Pre-requisite(s): </B>None<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>

<P><B>Credit Weighting: </B>10<BR><BR>
<B>Teaching Period(s): </B>Teaching Periods 1 and 2.<BR><BR>
<B>No. of Students: </B>-.<BR><BR>
<B>Pre-requisite(s): </B>None<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>

I am able to extract the data from the first set. Here is my sample code:

    // Get every <b> node and store the text that follows it, keyed by the label
    $result = array();
    foreach ($document->getElementsByTagName('b') as $node) {
        $result[preg_replace('/:\s+$/', '', $node->textContent)] = trim($node->nextSibling->textContent);
    }
    var_dump($result);
    echo '<br /><br />';

What I want to do now is loop over all three sets of HTML, get every b node, and keep the value that goes with each one. How can I solve this?

2 answers:

Answer 0: (score: 0)

Try

preg_match_all('/<B>(.*?)<\/B>([^<]+)/i', $text, $regs);

This assumes that the second piece of data (the value after the closing </B>) contains no HTML tags.
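To make the regex approach concrete, here is a minimal sketch that also groups the matches into one record per block. The sample `$text`, the `$records` and `$index` names, and the rule that a new record starts at each "Credit Weighting" label are assumptions for illustration, not part of the answer above.

```php
<?php
// Shortened sample input mirroring the markup in the question (assumed to be in $text).
$text = <<<HTML
<P><B>Credit Weighting: </B>5<BR><BR>
<B>Teaching Period(s): </B>Teaching Periods 1 and 2.<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>

<P><B>Credit Weighting: </B>20<BR><BR>
<B>Teaching Period(s): </B>Teaching Periods 1 and 2.<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>
HTML;

// PREG_SET_ORDER yields one entry per match: [full match, label, value].
preg_match_all('/<B>(.*?)<\/B>([^<]+)/i', $text, $matches, PREG_SET_ORDER);

$records = array();
$index = -1;
foreach ($matches as $match) {
    // Strip surrounding whitespace and the trailing colon from the label.
    $field = preg_replace('/:\s*$/', '', trim($match[1]));
    // Assumption: every block begins with the "Credit Weighting" field.
    if ($field === 'Credit Weighting') {
        $index++;
    }
    $records[$index][$field] = trim($match[2]);
}

print_r($records);
```

A regex like this is fragile against nested tags or attributes on the <B> element, so it is probably best treated as a quick fix for markup as regular as the sample.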

Answer 1: (score: 0)

Do you mean something like this?

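$result = array();
$id = -1;
foreach ($document->getElementsByTagName('b') as $node) {
    // Strip the trailing colon and whitespace from the label
    $field = preg_replace('/:\s+$/', '', $node->textContent);
    // Each block starts with "Credit Weighting", so begin a new record there
    if ($field === 'Credit Weighting') $id++;
    $result[$id][$field] = trim($node->nextSibling->textContent);
}
var_dump($result);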

This gives you:

array(3) {
  [0] =>
  array(4) {
    'Credit Weighting' =>
    string(1) "5"
    'Teaching Period(s)' =>
    string(25) "Teaching Periods 1 and 2."
    'No. of Students' =>
    string(2) "-."
    'Pre-requisite(s)' =>
    string(4) "None"
  }
  [1] =>
  array(4) {
    'Credit Weighting' =>
    string(2) "20"
    'Teaching Period(s)' =>
    string(25) "Teaching Periods 1 and 2."
    'No. of Students' =>
    string(2) "-."
    'Pre-requisite(s)' =>
    string(4) "None"
  }
  [2] =>
  array(4) {
    'Credit Weighting' =>
    string(2) "10"
    'Teaching Period(s)' =>
    string(25) "Teaching Periods 1 and 2."
    'No. of Students' =>
    string(2) "-."
    'Pre-requisite(s)' =>
    string(4) "None"
  }
}
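Neither the question nor the answers show how `$document` is built, so the following is a self-contained sketch under stated assumptions: the HTML is already in a string (here a shortened sample in `$html`, for example the body returned by cURL) and is loaded with `DOMDocument::loadHTML`. As a variation, it groups the fields by their enclosing <p> element through `DOMXPath` rather than watching for the "Credit Weighting" label. This relies on the HTML parser closing each open <P> when the next one starts.

```php
<?php
// Shortened sample input; in practice this string might come from curl_exec().
$html = <<<HTML
<P><B>Credit Weighting: </B>5<BR><BR>
<B>Pre-requisite(s): </B>None<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>

<P><B>Credit Weighting: </B>20<BR><BR>
<B>Pre-requisite(s): </B>None<BR><BR>
<P><A HREF="#top" class="toppage">[Top of page]</A></P>
HTML;

// Collect parser warnings internally instead of printing them,
// since this legacy markup leaves <P> tags unclosed.
libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($document);
$result = array();

// Select only the paragraphs that directly contain at least one <b> element.
foreach ($xpath->query('//p[b]') as $paragraph) {
    $record = array();
    foreach ($xpath->query('b', $paragraph) as $node) {
        $field = preg_replace('/:\s*$/', '', trim($node->textContent));
        // nextSibling is null when the <b> is the last child of its parent.
        $next = $node->nextSibling;
        $record[$field] = $next !== null ? trim($next->textContent) : '';
    }
    $result[] = $record;
}

var_dump($result);
```

Grouping by parent element avoids hard-coding a field name, but it depends on each block living in its own paragraph. If the real pages are laid out differently, the label-based counter from the answer may be the safer choice.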