Using regular expressions in PHP to extract content from different div tags

Time: 2016-05-10 06:11:31

Tags: php

Given the following HTML:

<div class="student_information">
        <tr>
           <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>

Desired output:
141201A
Sally Tan
83556112
141201A@gmail.com

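I want to get the content between the div and td tags. I understand that other options, such as XPath and DOMDocument, would be more suitable, but my project requires us to use regular expressions in PHP; otherwise the later parts of the project will be affected. Any help would be appreciated. Thanks.

4 Answers:

Answer 0 (score: 0)

Use PHP's strip_tags():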

<?php
$str = '<div class="student_information">
        <tr>
           <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>';
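// Remove all HTML tags, leaving only the text and the whitespace between values.
$stripped = strip_tags($str);
// Collapse line breaks and runs of whitespace into single spaces.
$stripped = trim(preg_replace('/\s+/', ' ', $stripped));
echo $stripped;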
?>

Output (all whitespace is collapsed, so the values come out on a single line):

141201A Sally 83556112 141201A@gmail.com

Answer 1 (score: 0)

Use preg_replace():

<?php
$str = '<div class="student_information">
        <tr>
           <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>';
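// Strip every tag with a regular expression.
$content = preg_replace('/<[^>]*>/', '', $str);
// Collapse all whitespace so the values are separated by single spaces.
$content = trim(preg_replace('/\s+/', ' ', $content));
// Split on spaces; note that a value containing a space would be split as well.
$arr = explode(' ', $content);
foreach ($arr as $elem) {
    echo $elem . "<br/>";
}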
?>

Output (as rendered in a browser):

141201A
Sally
83556112
141201A@gmail.com
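
Because the answer above splits on single spaces, a value such as "Sally Tan" (the name shown in the question's desired output) would be broken into two entries. One possible workaround, offered only as a sketch, is to replace each tag with a line break and then split on line breaks, so that spaces inside a value survive. The sample markup here is the question's HTML with the name changed to "Sally Tan" for illustration, and the variable name `$fields` is an assumption, not part of the original answer.

```php
<?php
// Hypothetical sample: the question's markup, but with a two-word name.
$str = '<div class="student_information">
        <tr>
            <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally Tan</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>';

// Replace each tag with a newline so neighbouring values stay separated.
$content = preg_replace('/<[^>]*>/', "\n", $str);

// Split on any whitespace run that contains a newline; spaces inside a
// value contain no newline, so "Sally Tan" stays in one piece.
$fields = preg_split('/\s*\n\s*/', $content, -1, PREG_SPLIT_NO_EMPTY);

foreach ($fields as $field) {
    echo $field, "\n";
}
```

This still treats HTML as plain text, so it is only reasonable for small, predictable fragments like the one in the question.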

Answer 2 (score: 0)

To remove the HTML markup, use PHP's strip_tags().

Answer 3 (score: 0)

You can use preg_match_all() (since it uses regular expressions), like this:
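
This answer gives no code, so here is a minimal sketch of how strip_tags() might be applied to the question's markup. Since strip_tags() leaves the original indentation and line breaks behind, the sketch adds a regular-expression split (preg_split() with `\R`, which matches any line break) to get one value per entry. The variable names `$html`, `$text`, and `$values` are assumptions made for illustration.

```php
<?php
// Sample markup copied from the question.
$html = '<div class="student_information">
        <tr>
           <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>';

// Remove every tag; the text and the surrounding whitespace remain.
$text = strip_tags($html);

// Split on line breaks, trim each line, and drop the lines that end up empty.
$lines  = array_map('trim', preg_split('/\R/', $text));
$values = array_values(array_filter($lines, 'strlen'));

echo implode("\n", $values), "\n";
```

This relies on each value sitting on its own line in the source, which holds for the question's sample but is not guaranteed for HTML in general.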

<?php
$str = '<div class="student_information">
        <tr>
            <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>';
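// Capture only the text inside each <td>; $matches[0] would still include the tags,
// while group 1 holds the values on their own.
preg_match_all('/<td>([a-zA-Z0-9.@]+)<\/td>/', $str, $matches);
print_r($matches[1]);

Result: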

Array
(
    [0] => 141201A
    [1] => Sally
    [2] => 83556112
    [3] => 141201A@gmail.com
)
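
Since the question is about different div tags, it may also help to keep track of which div each value came from. The following is a sketch under the assumption that every field looks exactly like the sample, that is, a `<div class="...">` directly wrapping a single `<td>`. The named groups `class` and `value` and the `$student` array are illustrative names, not part of the original answer.

```php
<?php
// Sample markup copied from the question.
$str = '<div class="student_information">
        <tr>
            <div class="admin"><td>141201A</td></div>
            <div class="name"><td>Sally</td></div>
            <div class="hp"><td>83556112</td></div>
            <div class="email"><td>141201A@gmail.com</td></div>
        </tr>';

// Capture the div's class name and the text of the <td> it wraps.
// [^<]* accepts any text up to the next tag, including spaces.
$pattern = '/<div class="(?<class>[^"]+)">\s*<td>(?<value>[^<]*)<\/td>\s*<\/div>/';
preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);

// Build a class => value map from the matches.
$student = [];
foreach ($matches as $m) {
    $student[$m['class']] = trim($m['value']);
}

print_r($student);
```

The outer `student_information` div is skipped because it is not immediately followed by a `<td>`. Any extra attribute or nested tag would break the pattern, which is why a DOM parser is usually preferred when the project allows it.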