Question

I am trying to get rid of some words inside a <note> tag. I have a very long string:

string(4687) "~~PB~~ {{:en:iot-open:remotelab:logotyp_1_.png?200|}} <note>testtest</note> ====== RoofTop Thermo Laboratory - intelligent house and heating management ====== The laboratory is located at nowhere, xxx, xxxxx on the roof of bu...... => dump result

The problem is that the <note>testtest</note> part is not removed from the string.

I am trying to use this function, which I found in the strip_tags manual.

function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}

Here is my full code:

foreach ($data as $line)
        {
            // Find list tag
            $posi = strpos($line, "* ");

            // No list ?
            if ($posi === false) {
                continue;
            }

            // Check indent
            if (($posi % 2) != 0){
                //echo "<li>Invalid indentation in TOC</li>\n";
            }

            // Calculate indent
            $indent = ($posi - 2) / 2;
            // Search for header
            $posh = strpos($line, "]]");

            // No header ?
            if ($posh === false) {
                continue;
            }
            // Extract file path
            $page_path = substr($line, $posi + 4, $posh - $posi - 4);
            $file_path = str_replace(":", "/", $page_path);
            $file_path = $this->getConf("homelab_datapages_folder").$file_path.".txt";
      $indent2 = 0;


            // Page file exists ?
            if (file_exists($file_path))
            {
                // Open file
                $page_content = htmlspecialchars(file_get_contents($file_path));
        $page_content = $this->strip_tags_content($page_content,'note',TRUE);
        $page_cont = strip_tags(html_entity_decode($page_content));
                // Shorten header
                $book_content .= $this->shorten_header($page_content, $indent, $indent2)."\n";

        var_dump($book_content);
        //$book_content .=
      }
            else
            {
                $book_content .= "---\n MISSING PAGE ---\n";
            }

            // Display page
            //echo "    <li>".$page_path." (".$indent.")</li>\n";
        }

What could be the problem?

Is my string too long for preg_replace, or have I made a mistake here?

Answer 1

When you call

$this->strip_tags_content($page_content,'note',TRUE);

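preg_match_all finds no tags, so $tags ends up as an empty array. Every test that follows is therefore false, and the function returns $text unmodified.

Call the function like this instead: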

$this->strip_tags_content($page_content,'<note>',TRUE);
//                                       ^____^
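The difference can be checked with a small sketch. It assumes a standalone copy of the function from the question (a plain function rather than a class method) and a made-up sample string shortened from the dump above:

```php
<?php
// Standalone copy of the function from the question (not a class method here).
function strip_tags_content($text, $tags = '', $invert = FALSE) {
    // Collect tag names written in tag syntax, e.g. '<note><b>' gives ['note', 'b'].
    preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
    $tags = array_unique($tags[1]);

    if (is_array($tags) AND count($tags) > 0) {
        if ($invert == FALSE) {
            return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
        }
        return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
    elseif ($invert == FALSE) {
        return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    return $text;
}

// Made-up sample input for illustration only.
$sample = 'intro <note>testtest</note> ====== RoofTop ======';

// Bare name: no tag is parsed, so the text comes back untouched.
$unchanged = strip_tags_content($sample, 'note', TRUE);

// Tag syntax: "note" is parsed and the whole <note>...</note> block is removed.
$stripped = strip_tags_content($sample, '<note>', TRUE);
```

With the bare name, $unchanged is identical to $sample; with '<note>', the block including its content is gone from $stripped.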

Answer 2

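I got it working.

The problem was the htmlspecialchars() function. It converts < and > into &lt; and &gt;, so the text no longer contains a literal <note> tag for the regular expression to match. I changed

$page_content = htmlspecialchars(file_get_contents($file_path));

to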

$page_content = file_get_contents($file_path);
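A short sketch of why the order matters. The sample string is made up, and the simplified pattern below targets only <note>; it is an illustration, not the function from the question:

```php
<?php
// Made-up sample input for illustration only.
$raw = 'before <note>testtest</note> after';

// Simplified pattern that removes <note>...</note> blocks only.
$pattern = '@<note\b.*?>.*?</note>@si';

// Escaping first turns "<note>" into "&lt;note&gt;", so the pattern cannot match.
$escaped_first = preg_replace($pattern, '', htmlspecialchars($raw));

// Stripping first and escaping afterwards removes the note, then escapes the rest.
$stripped_first = htmlspecialchars(preg_replace($pattern, '', $raw));
```

If the text still has to be escaped for HTML output, htmlspecialchars() can be applied after the tags have been removed, as in the second call.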

