Question

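I want to use PHP's preg_match_all() to extract some data from a table. I have the HTML below and want to get the value in the td that follows "Product code:", i.e. RC063154016. How can I do this? I have no experience with regular expressions.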

  <table width="100%" border="0" cellspacing="0" cellpadding="0">
      <tbody>
        <tr>
          <td><span>Product code:</span> RC063154016</td>                   
          <td><span>Gender:</span> Female</td>
        </tr>
      </tbody>
    </table>

Answer 1

Use DOMDocument:

$str = <<<STR
<table width="100%" border="0" cellspacing="0" cellpadding="0">
      <tbody>
        <tr>
          <td><span>Product code:</span> RC063154016</td>                   
          <td><span>Gender:</span> Female</td>
        </tr>
      </tbody>
    </table>
STR;

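$dom = new DOMDocument();
// @ suppresses the warnings libxml may raise for imperfect markup
@$dom->loadHTML($str);
// Collect every <td> element and print its text content
$tds = $dom->getElementsByTagName('td');
foreach ($tds as $td) {
  echo $td->nodeValue . '<br>';
}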

Output:

Product code: RC063154016
Gender: Female
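
The loop above prints the full text of every cell, label included. If only the product code itself is wanted, one option is to query the document with DOMXPath. The following is a minimal sketch, not a definitive implementation: it assumes the label text is exactly "Product code:" and that the value is the plain text following the span inside the same td. The variable names ($html, $xpath, $code) are illustrative.

```php
<?php
// Same markup as in the question.
$html = <<<HTML
<table width="100%" border="0" cellspacing="0" cellpadding="0">
  <tbody>
    <tr>
      <td><span>Product code:</span> RC063154016</td>
      <td><span>Gender:</span> Female</td>
    </tr>
  </tbody>
</table>
HTML;

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select the text nodes that are direct children of the <td>
// whose <span> reads "Product code:" (assumed label text).
$nodes = $xpath->query('//td[span[normalize-space(.)="Product code:"]]/text()');

// Join the text nodes and strip surrounding whitespace.
$code = '';
foreach ($nodes as $node) {
    $code .= $node->nodeValue;
}
$code = trim($code);

echo $code, PHP_EOL;
```

Because the XPath expression selects only the text nodes of the td, the label inside the span is left out and no string splitting is needed.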

Answer 2

This should work for you:

preg_match_all('|<td><span>Product code:</span>([^<]*)</td>|', $html, $match);

But if you think there may be arbitrary whitespace around the tags, use this instead:

preg_match_all('|<td>\s*<span>\s*Product code:\s*</span>([^<]*)</td>|', $html, $match);
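
For reference, here is a short sketch of how the result of the second pattern might be read. The $html string is a shortened, assumed stand-in for the table in the question. Note that the capture group ([^<]*) also picks up the space after </span>, so the values are trimmed here.

```php
<?php
// Shortened stand-in for the table row in the question (assumption).
$html = '<td><span>Product code:</span> RC063154016</td>'
      . '<td><span>Gender:</span> Female</td>';

preg_match_all(
    '|<td>\s*<span>\s*Product code:\s*</span>([^<]*)</td>|',
    $html,
    $match
);

// $match[0] holds the full matches, $match[1] the first capture group.
// The captured text starts with a space, so trim each value.
$codes = array_map('trim', $match[1]);

print_r($codes);
```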

Answer 3

$data = <<<HTML
  <table width="100%" border="0" cellspacing="0" cellpadding="0">
      <tbody>
        <tr>
          <td><span>Product code:</span> RC063154016</td>
          <td><span>Gender:</span> Female</td>
        </tr>
      </tbody>
    </table>
HTML;


if(preg_match_all('#<td>\s*<span>Product code:</span>\s*([^<]*)</td>#i', $data, $matches)) {
    print_r($matches);
}
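
Since the table contains a single product code, preg_match() with the same pattern would also do. A minimal sketch, assuming a shortened $data string and an illustrative $code variable:

```php
<?php
// Shortened stand-in for the markup above (assumption).
$data = '<td><span>Product code:</span> RC063154016</td>';

$code = null;
// preg_match() stops at the first match; $m[1] is the capture group.
if (preg_match('#<td>\s*<span>Product code:</span>\s*([^<]*)</td>#i', $data, $m)) {
    $code = trim($m[1]);
}

echo $code, PHP_EOL;
```

Here the \s* before the capture group consumes the space after </span>, and trim() guards against trailing whitespace before </td>.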

Answer 4

Use any HTML parser, parse the HTML, and work with the result. Do not use the preg_* functions here. Please read the answers to "How do you parse and process HTML/XML in PHP?"
