Nested placeholder replacement in PHP

Date: 2012-02-06 10:13:54

Tags: php regex

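My strings contain placeholders such as "{variant 1|variant 2}", where "|" means "or", and I want to get every variant of the string with the placeholders resolved. For example, the string "{a|b{c|d}}" should give the strings "a", "bc" and "bd". I tried the regular expression \{([^{}])\} (it matches the innermost level, which is {c|d} in my case), but on the next step I end up with two strings, {a|bc} and {a|bd}, which produce "a", "bc", "a", "bd", so "a" is duplicated. Maybe I need to build some kind of graph or tree structure? I would also like to ask about (?<postfix>[^{}|$]*): why is the "$" there? I removed it and nothing changed.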

2 answers:

Answer 0 (score: 1)

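Assuming "|", "{" and "}" are reserved characters (not allowed inside the content of a variant), here is a regex approach to the problem. Note that writing a simple state-machine parser would be the better choice.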

<?php // Using PHP5.3 syntax

// PCRE Recursive Pattern
// http://php.net/manual/en/regexp.reference.recursive.php

$string = "This test can be {very {cool|bad} in random order|or be just text} ddd {a|b{c|d}} bar {a|b{c{d|e|f}}} lala {b|c} baz";

if (preg_match_all('#\{((?>[^{}]+)|(?R))+\}#', $string, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[0] is one top-level group, e.g. "{a|b{c|d}}", "{a|b{c{d|e|f}}}" or "{b|c}"
        // have some fun splitting them up
        // I'd suggest walking the characters and building a tree
        // a simpler (slower, uglier) approach:

        // remove {}
        $set = substr($match[0], 1, -1);
        while (strpos($set, '{') !== false) {
            // explode and replace nested {}
            // reserved characters: "{" and "}" and "|"
            // (?<=^|\{|\|) -- the match must begin at the start of the string or right after "{" or "|";
            //  "(?<=" is a positive lookbehind assertion - its content is not captured
            // (?<prefix>[^{}|]*) -- the prefix: optional literal text before the nested group (no reserved characters)
            // \{(?<inner>[^{}]+)\} -- the content of an innermost {} group, excluding the "{" and "}"
            // (?<postfix>[^{}|$]*) -- the postfix: optional literal text after the group; the "$" inside the
            //  character class is a literal dollar sign, not an end-of-string anchor
            // readable: <begin-delimiter><possible-prefix>{<nested-group>}<possible-postfix>
            $set = preg_replace_callback('#(?<=^|\{|\|)(?<prefix>[^{}|]*)\{(?<inner>[^{}]+)\}(?<postfix>[^{}|$]*)#', function($m) {
                $inner = explode('|', $m['inner']);
                // separator first: the legacy implode/join(array, separator) order is rejected by PHP 8
                return $m['prefix'] . implode($m['postfix'] . '|' . $m['prefix'], $inner) . $m['postfix'];
            }, $set);
        }

        // $items = explode('|', $set);
        echo "$match[0] expands to {{$set}}\n";
    }
}

/*
    OUTPUT:
    {very {cool|bad} in random order|or be just text} expands to {very cool in random order|very bad in random order|or be just text}
    {a|b{c|d}} expands to {a|bc|bd}
    {a|b{c{d|e|f}}} expands to {a|bcd|bce|bcf}
    {b|c} expands to {b|c}
*/
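
On the question of why "$" appears in (?<postfix>[^{}|$]*): inside a PCRE character class "$" is an ordinary character, so it only stops the postfix at a literal dollar sign. That would explain why removing it made no visible difference on input without dollar signs. A minimal check, using a made-up sample string:

```php
<?php
// Inside a character class "$" is a literal dollar sign, not an end-of-string anchor.
$withDollar    = '#^[^{}|$]*#';
$withoutDollar = '#^[^{}|]*#';

$sample = 'cost $5 more'; // hypothetical input, chosen only to contain a "$"

preg_match($withDollar, $sample, $a);
preg_match($withoutDollar, $sample, $b);

var_dump($a[0]); // stops before the "$": "cost "
var_dump($b[0]); // runs to the end: "cost $5 more"
```

On strings that contain no "$", both patterns match the same text.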

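The answer above recommends a simple parser over the regex loop. Here is a minimal sketch of one, written as a small recursive-descent parser that walks the characters once. The function names expandVariants and parseSequence are hypothetical, the code needs PHP 7 or later for the type declarations, and it assumes "{", "}" and "|" are reserved and that the braces are balanced.

```php
<?php
// Hypothetical helpers for illustration; not part of either answer.

/**
 * Return every string obtained by resolving all {x|y} groups in $text.
 * Assumes "{", "}" and "|" are reserved and braces are balanced.
 */
function expandVariants(string $text): array
{
    $pos = 0;
    return parseSequence($text, $pos);
}

/**
 * Parse literal text and nested groups up to the next unconsumed "|" or "}"
 * (or the end of input) and return all strings that stretch can expand to.
 */
function parseSequence(string $text, int &$pos): array
{
    $results = [''];
    $length  = strlen($text);

    while ($pos < $length && $text[$pos] !== '|' && $text[$pos] !== '}') {
        if ($text[$pos] === '{') {
            $pos++; // consume "{"
            $parts = [];
            do {
                // every alternative is itself a sequence that may contain groups
                $parts = array_merge($parts, parseSequence($text, $pos));
                $more  = $pos < $length && $text[$pos] === '|';
                if ($more) {
                    $pos++; // consume "|"
                }
            } while ($more);
            if ($pos < $length && $text[$pos] === '}') {
                $pos++; // consume "}"
            }
        } else {
            // literal run up to the next reserved character
            $run   = strcspn($text, '{|}', $pos);
            $parts = [substr($text, $pos, $run)];
            $pos  += $run;
        }

        // append every part to every string built so far
        $next = [];
        foreach ($results as $head) {
            foreach ($parts as $tail) {
                $next[] = $head . $tail;
            }
        }
        $results = $next;
    }

    return $results;
}

print_r(expandVariants('{a|b{c|d}}')); // the question's example: a, bc, bd
```

Because each alternative is expanded where it is parsed, no intermediate strings such as {a|bc} and {a|bd} are produced, so the duplicated "a" described in the question does not arise.
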
Answer 1 (score: 0)

Check this code:

$str = "This test can be {very {cool|bad} in random order|or be just text}";

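// Replaces each innermost {...} group with a "__N" token and stores its alternatives
// in $buffer['tokens'][N]; recurses until no "{" remains.
function parseVarians($str, $buffer = array()) {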
    if (empty($buffer)) $buffer['tokens'] = array();
    $newStr = preg_replace_callback('|\{([^{}]+)\}|', function($m) use(&$buffer) {
        $buffer['tokens'][] = explode('|', $m[1]);
        $index = count($buffer['tokens']) - 1;
        return '__' . $index;
    }, $str);

    if ($str != $newStr && strpos($newStr, '{') !== false) {
        return parseVarians($newStr, $buffer);
    }
    else {
        $buffer['str'] = $newStr;
        return $buffer;
    }
}

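// Substitutes the tokens back into the string, producing every combination of alternatives.
function devergeVariants($data) {
    // Outer groups get higher indexes and may contain inner tokens, so expand them first.
    krsort($data['tokens']);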
    $strings  = array($data['str']);

    foreach ($data['tokens'] as $key => $token) {
        $variants = array();
        foreach ($token as $tok) {
            foreach ($strings as $str) {
                $variants[] = str_replace('__' . $key, $tok, $str);
            }
        }
        $strings = $variants;
    }

    return array_unique($strings);
}

echo '<pre>'; print_r($str); echo '</pre>';

$tokens = parseVarians($str);
//echo '<pre>'; print_r($tokens); echo '</pre>';
$result = devergeVariants($tokens);

echo '<pre>'; print_r( $result ); echo '</pre>';

Output:

This test can be {very {cool|bad} in random order|or be just text}
Array
(
    [0] => This test can be very cool in random order
    [1] => This test can be or be just text
    [2] => This test can be very bad in random order
)

Is this what you are after?