Parsing backslash escapes using str_replace

Time: 2015-03-18 23:08:48

Tags: php string replace escaping string-interpolation

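Suppose I have a function templateMap that, for each subarray of $array, replaces every occurrence of a given @n (for some n) in $string with the corresponding value from that subarray, and returns the resulting array of strings. Suppose I also want to let the user escape the @ symbol with a backslash (which means \ itself must be escapable with a backslash too).

For example: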

function templateMap ($string, $array) {
    $newArray = array();
    foreach($array as $subArray) {
        foreach($subArray as $replacements) {
            ...
        }
    }
    return $newArray;
}


// for grouping mysql statements with parentheses
templateMap("(@)", array(" col1 < 5 && col2 > 6 ", " col3 < 3 || col4 > 7 "));

This would produce

array("( col1 < 5 && col2 > 6 )", "( col3 < 3 || col4 > 7 )")

Here is a more complex example with multiple parameters, which is probably not as easy to implement:

templateMap("You can tweet @0 \@2 @1", array(
    array("Sarah", "ssarahtweetzz"),
    array("John", "jjohnsthetweetiest"),
    ...
));

/* output: 
array(
    "You can tweet Sarah @2 ssarahtweetzz",
    "You can tweet John @2 jjohnsthetweetiest"
)
*/

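Is there a way to accomplish this with a series of str_replace calls? (As opposed to a regular expression or a simple state machine.)

One thing I thought of was to replace each occurrence of \@ with an unusual string that does not occur in the current string, such as zzzzzz. But then, of course, you have to check whether that string already occurs in the given string and change it accordingly.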

2 answers:

Answer 0: (score: 1)

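While the replacements are being made, there must not be any \@ around, only the @ symbols that actually need replacing. So we have to get rid of all \@ sequences first. But when we get rid of the \@ sequences, none of them may actually be part of a \\@ sequence (two backslashes followed by an @). To get rid of the \\ sequences, we can use a new escape character, %.

Specifically, if we escape % as %%, then we can escape any other sequence as ?%?, where ? is any character, and be guaranteed that ?%? can be unescaped, because % will never appear on its own in the middle of the text.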
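The encoding described above can be tried in isolation. The following is only a sketch: the helper names encodeEscapes and decodeEscapes are made up for illustration, and the markers ?%? and !%! are the ones this answer uses.

```php
<?php
// Hypothetical helpers; only str_replace is used, per the question's constraint.

// Encode: double every %, then hide \\ and \@ behind markers that contain a lone %.
function encodeEscapes(string $s): string {
    $s = str_replace('%', '%%', $s);      // from here on, % from the input only occurs in pairs
    $s = str_replace('\\\\', '?%?', $s);  // escaped backslash (two characters in the input)
    $s = str_replace('\@', '!%!', $s);    // escaped @
    return $s;
}

// Decode: undo the three steps in reverse order.
function decodeEscapes(string $s): string {
    $s = str_replace('!%!', '@', $s);
    $s = str_replace('?%?', '\\', $s);    // '\\' is a single backslash
    $s = str_replace('%%', '%', $s);
    return $s;
}

$template = '100% \@home \\\\@';          // raw text: 100% \@home \\@
echo encodeEscapes($template), "\n";      // 100%% !%!home ?%?@

// The only bare @ left is the live placeholder. Decoding the text
// in front of it restores the literal characters.
echo decodeEscapes('100%% !%!home ?%?'), "\n";   // 100% @home \ (one backslash)
```

Replacing \\ before \@ is what makes \\@ come out as a literal backslash followed by a live @, rather than a backslash followed by an escaped @.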

// Wrapper for native strings to make chaining easier.
// Named StringWrapper because "String" is a reserved class name as of PHP 7.
class StringWrapper {
    private $str;
    public function __construct ($str) {
        $this->str = $str;
    }
    public function replace ($search, $substitute) {
        return new self(str_replace($search, $substitute, $this->str));
    }
    public function toRaw () {
        return $this->str;
    }
}

function templateMap ($str, $arr) {
    // Encode: double every %, then hide \\ and \@ behind markers containing a lone %
    $encodedStr = (new StringWrapper($str))->replace('%', '%%')
        ->replace('\\\\', '?%?')->replace('\@', '!%!');
    $newArr = array();
    foreach($arr as $el) {
        // Every @ still present is a live placeholder, so split on it
        $encodedStrPieces = explode("@", $encodedStr->toRaw());
        foreach($encodedStrPieces as $i => $piece) {
            // Decode each piece in the reverse order of the encoding
            $encodedStrPieces[$i] = (new StringWrapper($piece))
            ->replace('!%!', '@')->replace('?%?', '\\')
            ->replace('%%', '%')->toRaw();
        }
        // Rejoin the pieces with the replacement value in place of each @
        $newArr[] = implode($el, $encodedStrPieces);
    }
    return $newArr;
}


$arr = templateMap("(@\@)", array("hello", "goodbye"));
var_dump($arr); // => ["(hello@)", "(goodbye@)"]

Answer 1: (score: 1)

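I think the main problem with being limited to str_replace is that you have little control over which strings get replaced (since every occurrence is replaced), and that you need to take special care when choosing a placeholder for the \@ escape sequence. Two inserted values could combine to produce the placeholder string, which would then be turned into an @ character when the placeholder replacement is reverted.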
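A small sketch of that collision, assuming a naive implementation with a fixed placeholder. The function naiveTemplate and the placeholder "xyz" are illustrative only and are not the code of this answer.

```php
<?php
// Naive approach with a fixed placeholder ("xyz" is an arbitrary, assumed choice).
function naiveTemplate(string $template, array $values, string $placeholder = 'xyz'): string {
    $s = str_replace('\@', $placeholder, $template);   // hide escaped @
    foreach ($values as $i => $value) {
        $s = str_replace("@$i", $value, $s);            // substitute @0, @1, ...
    }
    return str_replace($placeholder, '@', $s);          // restore escaped @
}

// Neither value contains "xyz", but side by side they spell it.
echo naiveTemplate('\@ @0@1', array('xy', 'z')), "\n";  // prints "@ @", not "@ xyz"
```

Checking each value separately against the placeholder is therefore not enough; the final string has to be checked as well.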

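Below is a brute-force solution that tries to deal with this. It tests one placeholder at a time against the template string, the replacement values and the final string, making sure that the placeholder does not appear in any of them and that the number of placeholders originally introduced for \@ matches the number of placeholders reverted. You may want to set a default placeholder other than xyz (such as a zero char or something else), whichever suits you best, to avoid unnecessary processing.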
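Both checks rely on the optional fourth argument of str_replace, which receives the number of replacements made. A minimal sketch of using it as a containment test (the helper name containsSubstring is hypothetical):

```php
<?php
// Hypothetical helper: a containment test that uses only str_replace.
function containsSubstring(string $haystack, string $needle): bool {
    // The fourth argument is filled with the number of replacements performed.
    str_replace($needle, '', $haystack, $count);
    return $count > 0;
}

var_dump(containsSubstring('You can tweet @0 \@2 @1', 'xyz'));  // bool(false)
var_dump(containsSubstring('abcxyzdef', 'xyz'));                 // bool(true)
```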

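It can be called with either substitution pattern (@ or @<n>), but for now the two cannot be mixed.

This is not the prettiest code I have ever written, but given the str_replace constraint it is my shot at it, and I hope it helps you.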

function templateMap ($string, $array, $defaultPlaceholder = "xyz")
{
    $newArray = array();

    // Create an array of the subject string and replacement arrays
    $knownStrings = array($string);
    foreach ($array as $subArray) {
        if (is_array($subArray)) {
            $knownStrings = array_merge($knownStrings, array_values($subArray));
        }
        else {
            $knownStrings[] = $subArray;
        }
    }

    $placeHolder = '';

    while (true) {
        if (!$placeHolder) {
            // This is the first try, so let's try the default placeholder
            $placeHolder = $defaultPlaceholder;
        }
        else {
            // We've been here before - we need to try another placeholder
            $placeHolder = uniqid('bs-placeholder-', true);
        }

        // Try to find a placeholder that does not appear in any of the strings
        foreach ($knownStrings as $knownString) {
            // Does $placeHolder exist in $knownString?
            str_replace($placeHolder, 'whatever', $knownString, $count);
            if ($count > 0) {
                // Placeholder candidate was found in one of the strings
                continue 2; // Start over
            }
        }

        // Will go for placeholder "$placeHolder"
        foreach ($array as $subArray) {
            $newString = $string;

            // Apply placeholder for \@ - remember number of replacements
            $newString = str_replace(
                '\@', $placeHolder, $newString, $numberOfFirstReplacements
            );

            if (is_array($subArray)) {
                // Make substitution on @<n>, skipping indexes that have no value
                for ($i = 0; $i <= 9; $i++) {
                    if (isset($subArray[$i])) {
                        $newString = str_replace("@$i", $subArray[$i], $newString);
                    }
                }
            }
            else {
                // Make substitution on @
                $newString = str_replace("@", $subArray, $newString);
            }

            // Revert placeholder for \@ - remember number of replacements
            $newString = str_replace(
                $placeHolder, '@', $newString, $numberOfSecondReplacements
            );

            if ($numberOfFirstReplacements != $numberOfSecondReplacements) {
                // Darn - value substitution caused used placeholder to appear,
                // ruining our day - we need some other placeholder
                $newArray = array();
                continue 2;
            }

            // Looks promising
            $newArray[] = $newString;
        }

        // All is well that ends well
        break;
    }
    return $newArray;
}

$a = templateMap(
    "(@ and one escaped \@)",
    array(" col1 < 5 && col2 > 6", " col3 < 3 || col4 > 7")
);
print_r($a);

$a = templateMap(
    "You can tweet @0 \@2 @1",
    array(
        array("Sarah", "ssarahtweetz"),
        array("John", "jjohnsthetweetiest"),
    )
);
print_r($a);

Output:

Array
(
    [0] => ( col1 < 5 && col2 > 6 and one escaped @)
    [1] => ( col3 < 3 || col4 > 7 and one escaped @)
)
Array
(
    [0] => You can tweet Sarah @2 ssarahtweetz
    [1] => You can tweet John @2 jjohnsthetweetiest
)