Question

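So I'm working with some really wonderful HTML strings stored in our database, and I need to be able to parse out the string between "forum-style" youtube tags, as in the example below. I have a solution, but it feels a bit hackish. I figure there is probably a more elegant way to handle this.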

<?php

    $video_string = '<p><span style="font-size: 12px;"><span style="font-family: verdana,geneva,sans-serif;">[youtube]KbI_7IHAsyw[/youtube]<br /></span></span></p>';

    $matches = array();
    preg_match('/\][_A-Za-z0-9]+\[/', $video_string, $matches);

    $yt_vid_key = substr($matches[0], 1, strlen($matches[0]) - 2 );

Answer 1

I would change the regex:

    '/\[youtube\](.*?)\[\/youtube\]/is'

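Adding the 'youtube' part means it no longer matches every BBCode tag, only the right one. I also added a '?' to make the regex non-greedy (in case there are several YT videos in one post). I added the pattern modifiers i and s so that the match is case-insensitive and the dot also matches newlines, which lets it handle multi-line strings.

Edit: You might also want to use preg_replace, which would mean less code.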
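To illustrate the replace idea, here is a minimal sketch that swaps each tag for embed markup. I'm using preg_replace_callback rather than plain preg_replace so the captured text can be escaped before it goes back into the HTML, since `(.*?)` will capture anything. The iframe markup is only an assumption about what you want to output; adjust it to your site.

```php
<?php
$video_string = '<p>[youtube]KbI_7IHAsyw[/youtube]<br /></p>';

// Replace every [youtube]...[/youtube] pair with embed markup.
// The iframe below is illustrative only, not a required format.
$html = preg_replace_callback(
    '/\[youtube\](.*?)\[\/youtube\]/is',
    function (array $m): string {
        // $m[1] is whatever sat between the tags; escape it before reuse.
        $key = htmlspecialchars(trim($m[1]), ENT_QUOTES);
        return '<iframe src="https://www.youtube.com/embed/' . $key . '"></iframe>';
    },
    $video_string
);
```

If you only need the key and not the replaced HTML, preg_match with the same pattern and `$matches[1]` is enough.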

Answer 2

Try this:

    preg_match('!\[youtube\]([_A-Za-z0-9]+?)\[/youtube\]!', $subject, $matches);

    $yt_vid_key = $matches[1];

If you expect multiple occurrences, use preg_match_all instead.
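A rough sketch of the preg_match_all variant, assuming a post with two videos. The second key is made up for the example. Note that the character class has no hyphen in it, so a key containing `-` would not match unless you add it.

```php
<?php
// Sample input; the second key is a made-up placeholder.
$subject = '[youtube]KbI_7IHAsyw[/youtube] text [youtube]abc_123XYZ0[/youtube]';

$matches = array();
$count = preg_match_all('!\[youtube\]([_A-Za-z0-9]+?)\[/youtube\]!', $subject, $matches);

// With the default PREG_PATTERN_ORDER flag, $matches[1] lists
// every captured key in the order it appears in $subject.
$yt_vid_keys = $matches[1];
```

Here `$count` is the number of full matches found, so you can check it before touching `$yt_vid_keys`.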

Answer 3

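If you are not expecting nested tags, then all the answers provided here are correct. Otherwise, you have to come up with a way to match the tags correctly, which cannot really be done with a plain regular expression; you will have to write your own way of handling it.

Here is some pseudocode that might help you: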

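    find the opening tag you want to match; start scanning just after it

    openTags = 1      // the opening tag just found
    closeTags = 0
    position = index just after the opening tag

    do {
        move through the string: advance position to the next tag
        if an opening tag matches: openTags++
        if a closing tag matches: closeTags++, positionOfCloseTag = position
    } while (openTags > closeTags and position < end of string);

When the loop ends with openTags equal to closeTags, positionOfCloseTag is the closing tag that correctly matches the opening tag. If the string runs out first, the tags are unbalanced.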
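The counting idea above could be sketched in PHP roughly like this. The function name `extract_balanced_tag` is hypothetical, it only handles nesting of one tag name at a time, and it uses a single depth counter instead of separate open and close counts.

```php
<?php
// Hypothetical helper: returns the content of the first [tag]...[/tag]
// pair in $text, honouring nested pairs of the same tag.
// Returns null if there is no opening tag or the tags are unbalanced.
function extract_balanced_tag(string $text, string $tag): ?string
{
    $open  = '[' . $tag . ']';
    $close = '[/' . $tag . ']';

    $start = stripos($text, $open);
    if ($start === false) {
        return null; // no opening tag at all
    }

    $contentStart = $start + strlen($open);
    $position = $contentStart;
    $depth = 1; // counts the opening tag we just found

    while ($depth > 0) {
        $nextOpen  = stripos($text, $open, $position);
        $nextClose = stripos($text, $close, $position);

        if ($nextClose === false) {
            return null; // ran out of closing tags: unbalanced input
        }

        if ($nextOpen !== false && $nextOpen < $nextClose) {
            $depth++; // a nested opening tag comes first
            $position = $nextOpen + strlen($open);
        } else {
            $depth--; // a closing tag comes first
            $position = $nextClose + strlen($close);
        }
    }

    // $position now sits just past the matching closing tag.
    $length = $position - strlen($close) - $contentStart;
    return substr($text, $contentStart, $length);
}
```

For example, calling it on `[quote]a[quote]b[/quote]c[/quote]d` with the tag `quote` gives back `a[quote]b[/quote]c`, the content of the outer pair, where a non-greedy regex would stop at the first closing tag.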

What's a smart way to parse "forum-style" tags in a string of random HTML?
