Regex matching between 2 text files

Time: 2014-02-13 09:43:56

Tags: php regex file

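'1.txt' contains several lines of URLs. '2.txt' contains several lines of root domains.

For each root domain in '2.txt', I want to check whether it occurs in '1.txt' and, if it does, print the entire matching line from '1.txt'.

For example:

'1.txt'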

http://somesubdomain.rootdomain1.com/242348788/very-long-text/
http://rootdomain1.com/etetdfgret/
http://rootdomain2.com/value?somevalue
http://rootdomain1.com/value?somevalue
http://rootdomain3.com/value?somevalue2

'2.txt'

rootdomain1.com

should return

http://somesubdomain.rootdomain1.com/242348788/very-long-text/
http://rootdomain1.com/etetdfgret/
http://rootdomain1.com/value?somevalue

I got it working, but if '2.txt' contains several lines, only the last line of '2.txt' gets matched against '1.txt'. Using the example above, if:

'2.txt'

rootdomain1.com
rootdomain4.com

It returns nothing. (That is, the last line, rootdomain4.com, does not match.)

This is what I have so far.

  <?php
  $file = '1.txt';

  // get the file contents, assuming the file to be readable (and exist)
  $contents = file_get_contents($file);

  $lines = file('2.txt');

  ?>

  <div class="datagrid">
    <table>
      <tbody>
        <?php
        foreach($lines as $line) {
          // escape special characters in the query
          $pattern = preg_quote($line, '/');
          // finalise the regular expression, matching the whole line
          $pattern = "/^.*$pattern.*\$/m";
          // search, and store all matching occurrences in $matches
          preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);
          foreach ($matches as $val) {
            echo "<tr><td>".preg_replace('/\s+/', '', $val[0])."</td></tr>";
          }
        }
        ?>
      </tbody>
      <tfoot>
        <tr>
          <td>
            <div id="paging"><strong><?php echo "TOTAL: ".count($matches); ?></strong></div>
          </td>
        </tr>
      </tfoot>
    </table>
  </div>

2 Answers:

Answer 0: (score: 1)

At least one problem is that you escaped the $:

$pattern = "/^.*$pattern.*\$/m";

Remove the backslash.
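
As a quick check of this point, the sketch below builds the pattern both ways. In a double-quoted PHP string, `\$` is the escape sequence for a literal dollar sign, and a bare `$` followed by `/` is not parsed as a variable, so the two spellings should yield the same pattern string and the backslash may not be what breaks the match. The value of `$quoted` is a hypothetical example and does not come from the question's files.

```php
<?php
// Hypothetical, already-quoted search term used only for illustration.
$quoted = preg_quote('rootdomain1.com', '/');

// Spelling from the question: "\$" is the escape for a literal dollar sign.
$withBackslash = "/^.*$quoted.*\$/m";

// Spelling suggested above: "$/" cannot start a variable name, so "$" stays literal.
$withoutBackslash = "/^.*$quoted.*$/m";

// Compare the two resulting pattern strings and show one of them.
var_dump($withBackslash === $withoutBackslash);
echo $withBackslash, PHP_EOL;
```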

Answer 1: (score: 0)

Try removing possible newlines from your pattern. Also, as mentioned above, remove the backslash in front of the $.

foreach($lines as $line) {
    // escape special characters in the query
    $pattern = preg_quote($line, '/');

    // remove new lines
    $pattern = str_replace("\n", "", $pattern);

    // finalise the regular expression, matching the whole line
    $pattern = "/^.*$pattern.*$/m";

    // search, and store all matching occurrences in $matches
    preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);
    foreach ($matches as $val) {
        echo "<tr><td>".preg_replace('/\s+/', '', $val[0])."</td></tr>";
    }
}
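
To illustrate why a newline left inside the pattern matters, here is a minimal sketch that uses in-memory strings in place of the two files. `$contents` and `$rawLine` are stand-ins, assuming `file()` returned a line with its trailing newline still attached.

```php
<?php
// Stand-in for the contents of '1.txt'.
$contents = "http://rootdomain1.com/etetdfgret/\nhttp://rootdomain2.com/value?somevalue\n";

// Stand-in for one element of file('2.txt'): the trailing newline is kept.
$rawLine = "rootdomain1.com\n";

// With the newline inside the pattern, the domain must be followed directly
// by a line break, which never happens in these URLs.
$withNewline = preg_match_all('/^.*' . preg_quote($rawLine, '/') . '.*$/m', $contents);

// After stripping the line ending, the domain can match anywhere in a line.
$trimmed = rtrim($rawLine, "\r\n");
$withoutNewline = preg_match_all('/^.*' . preg_quote($trimmed, '/') . '.*$/m', $contents);

// Number of matching lines for each variant.
var_dump($withNewline, $withoutNewline);
```

Using `rtrim($line, "\r\n")` instead of replacing only `"\n"` also covers files saved with Windows line endings.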

Alternatively, pass the FILE_IGNORE_NEW_LINES flag to the file() function.
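
A sketch of that flag-based variant follows. The helper name `findMatchingLines` is hypothetical and not part of the original code, and both files are assumed to exist and be readable. It also collects the matches for all domains in one array, so a total can be computed over everything that was found.

```php
<?php
/**
 * Return every line of $urlFile that contains any domain listed in $domainFile.
 * Hypothetical helper for illustration only.
 */
function findMatchingLines(string $urlFile, string $domainFile): array
{
    $contents = file_get_contents($urlFile);

    // Strip line endings and skip empty lines while reading the domain list.
    $domains = file($domainFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

    $found = [];
    foreach ($domains as $domain) {
        // trim() also drops a stray "\r" left over from Windows line endings.
        $domain = trim($domain);
        if ($domain === '') {
            // An empty search term would match every line, so skip it.
            continue;
        }

        // Escape the domain and match the whole line that contains it.
        $pattern = '/^.*' . preg_quote($domain, '/') . '.*$/m';
        preg_match_all($pattern, $contents, $matches);

        foreach ($matches[0] as $line) {
            $found[] = trim($line);
        }
    }

    return $found;
}
```

Calling `findMatchingLines('1.txt', '2.txt')` and applying `count()` to the result gives a total across all domains, whereas `count($matches)` after the loop reflects only the last domain that was processed. A line that contains two of the listed domains appears twice in the result; `array_unique()` can remove such duplicates if that is not wanted.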