PHP preg_match_all: read content and exclude unwanted content

Date: 2012-04-21 07:48:18

Tags: php

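I have a text file that I want to read, but I want to exclude lines that begin with a certain character (here "@", or any other character to be defined later):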

@ I don't want this line to be read
This line should be read;
"This one" should be read, too;
'Also this one' should be read;
...etc
@ But this one should be ignored;

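With the code below I can explode the content on the semicolons (";") that end each line, but the last line should not be included, because it starts with "@".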

$contents = file_get_contents($the_path);
$result = array_map('trim', explode(";", $contents));

Any hint on how to achieve this? Thanks.

Updated code:

// http://stackoverflow.com/questions/10257244/php-preg-match-all-read-content-and-exclude-unwanted/10257319
  $results = array();
  $matches = array();
  $the_path = '/path/to/file.txt';
  if (is_file($the_path)) {
    $contents = file_get_contents($the_path);
    if ($contents) {
      // ! array warning
      // $contents = array_map('rtrim', $contents);
      // $matches = preg_grep('#^@#', $contents, PREG_GREP_INVERT);
      $matches = preg_split("/[\r\n]/", preg_replace("/@.*?[\r\n]/", "", $contents), NULL, PREG_SPLIT_NO_EMPTY);

      if ($matches) {
        foreach ($matches as $key => $val) {
          $results[$key] = $val;
        }
      }
    }
  }
  // Attempt to remove the first 0 key, and start from 1, because 0|value0 is considered NULL
  $results = array_combine(range(1, count($results)), array_values($results));

  return !empty($results) ? $results : array();
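
One caveat about the reindexing line above: when $results is empty, range(1, 0) still returns two elements (1 and 0), so array_combine() receives arrays of different sizes and fails (a warning and false in PHP 7, a ValueError in PHP 8). A minimal sketch of a guard follows; the helper name reindex_from_one() is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper (not from the original post): reindex a list so keys start at 1.
function reindex_from_one(array $items): array
{
    if ($items === []) {
        // range(1, 0) would return [1, 0], which array_combine() rejects.
        return [];
    }
    return array_combine(range(1, count($items)), array_values($items));
}

print_r(reindex_from_one(['This line should be read;', '"This one" should be read, too;']));
```

With this guard the function can return its result directly, without the extra !empty() check.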

Update 2, working thanks to DCoder:

  $matches = array();
  if ($contents = file($the_path)) {
      $contents = array_map('rtrim', $contents);
      $keyword = '@';
      // Still output @line
      // $matches = preg_grep('#^@#', $contents, PREG_GREP_INVERT); 
      // Ok, thanks to http://php.net/manual/de/function.preg-grep.php#85503
      $matches = preg_grep("/{$keyword}/i", $contents, PREG_GREP_INVERT);         

      // $matches = preg_split("/[\r\n]/", preg_replace("/@.*?[\r\n]/", "", $contents), NULL, PREG_SPLIT_NO_EMPTY);
      // dsm($matches);
      if ($matches) {
        foreach ($matches as $key => $match) {
         $results[$key] = $match;
        }
      }
  }


  // $results = array_combine(range(1, count($results)), array_values($results));
  return $results;
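
The unanchored pattern "/{$keyword}/i" above drops every line that contains the keyword anywhere, not only lines that start with it. If the anchored version still let "@" lines through, one possible cause (an assumption, not confirmed in the post) is leading whitespace or a UTF-8 byte order mark in front of the "@". A minimal sketch that anchors the pattern, tolerates both, and escapes the prefix with preg_quote(); the function name lines_without_prefix() is hypothetical.

```php
<?php
// Hypothetical helper: keep only lines that do not start with $prefix.
function lines_without_prefix(array $lines, string $prefix): array
{
    // preg_quote() escapes regex metacharacters in the prefix and the '/' delimiter.
    $quoted = preg_quote($prefix, '/');
    // Optional UTF-8 BOM, optional leading whitespace, then the prefix.
    $pattern = '/^(?:\xEF\xBB\xBF)?\s*' . $quoted . '/';
    // preg_grep() keeps the original keys; array_values() renumbers them.
    return array_values(preg_grep($pattern, $lines, PREG_GREP_INVERT));
}

$lines = [
    "@ I don't want this line to be read",
    'This line should be read;',
    '  @ indented comment',
    'mail me at user@example.com;',
];
print_r(lines_without_prefix($lines, '@'));
```

Because the prefix is passed in and escaped, it can be changed later to characters such as "#" or "*" without touching the pattern.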

2 answers:

Answer 0: (score: 1)

// get the contents of the file as an array of lines
$contents = file($the_path);
if($contents === false) {
    throw new Exception("Failed to open file {$the_path}");
}
// drop ending newlines
$contents = array_map('rtrim', $contents);

// find all lines except those starting with @
$matched = preg_grep('#^@#', $contents, PREG_GREP_INVERT);
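
Note that preg_grep() preserves the original array keys, so the result can start at a key other than 0 and contain gaps. A minimal sketch, using the sample lines from the question as an inline array instead of a file:

```php
<?php
// Sample lines from the question, already split and trimmed.
$contents = [
    "@ I don't want this line to be read",
    'This line should be read;',
    '"This one" should be read, too;',
    "'Also this one' should be read;",
    '@ But this one should be ignored;',
];

// Keys of the kept lines are preserved: 1, 2, 3.
$matched = preg_grep('#^@#', $contents, PREG_GREP_INVERT);

// Renumber from 0 when consecutive keys are needed.
$reindexed = array_values($matched);

print_r($matched);
print_r($reindexed);
```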

Answer 1: (score: 1)

With this code, $lines will contain an array of all lines that do not start with @:

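$contents = file_get_contents($the_path);
// The m modifier anchors ^ and $ to each line, so only lines that start with @ are removed,
// including a final line that has no trailing newline. -1 means "no limit".
$lines = preg_split("/[\r\n]+/", preg_replace('/^@.*$/m', '', $contents), -1, PREG_SPLIT_NO_EMPTY);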
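
Two edge cases are worth checking with the string-based approach. A pattern without a line-start anchor would also strip text after an "@" in the middle of a line, and one that requires a trailing line break would miss a final "@" line at the end of the file. A small sketch that anchors the match to the start of each line with the m modifier, using an inline string as a stand-in for the file contents (the sample data is made up for illustration):

```php
<?php
// Inline stand-in for file_get_contents($the_path); note: no trailing newline.
$contents = "@ I don't want this line to be read\n"
    . "This line should be read;\r\n"
    . "mail me at user@example.com;\n"
    . "@ But this one should be ignored;";

// ^ and $ match at line boundaries because of the m modifier.
$stripped = preg_replace('/^@.*$/m', '', $contents);

// Split on any run of line-break characters and drop the empty pieces.
$lines = preg_split('/[\r\n]+/', $stripped, -1, PREG_SPLIT_NO_EMPTY);

print_r($lines);
```

Here the line containing the e-mail address is kept, and both "@" lines are removed.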