Finding circular replacement patterns

Time: 2014-08-07 15:09:30

Tags: php regex replace cycle-detection

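Suppose I want to expand a string by replacing placeholders from a dictionary. The replacement strings can themselves contain placeholders: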

$pattern = "#a# #b#";
$dict = array("a" => "foo", "b" => "bar #c#", "c" => "baz");
while($match_count = preg_match_all('/#([^#]+)#/', $pattern, $matches)) {
    for($i=0; $i<$match_count; $i++) {
        $key = $matches[1][$i];
        if(!isset($dict[$key])) { throw new Exception("'$key' not found!"); }
        $pattern = str_replace($matches[0][$i], $dict[$key], $pattern); 
     }   
}
echo $pattern;

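This works fine as long as there is no circular replacement pattern, for example "c" => "#b#". In that case the program gets stuck in an endless loop until memory is exhausted.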

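Is there an easy way to detect such patterns? I am looking for a solution where the distance between the replacements can be arbitrarily long, e.g. a->b->c->d->f->a.
Ideally, the solution would also run inside the loop rather than as a separate analysis.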

2 answers:

Answer 0: (score: 0)

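Single-character keys

If the keys are single characters, this is easy: just check whether any string on the value side contains a character that is also a key.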

foreach ($your_array as $key => $value) {
    foreach(str_split($value) as $ch) {
        if (array_key_exists($ch, $your_array)) {
            # Problem: a value refers to a key, so a cycle is possible
        }
    }
}
#We're fine

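Even if a cycle exists, it will not necessarily be triggered by every string (for an empty string, for example, no pattern fires, so no loop occurs). You can therefore build the check into your expander: if a rule fires a second time, there is a problem, because the earlier expansion of that rule produced the situation that triggers it again, and it will keep doing so again and again.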
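
The in-loop check described above can be sketched as follows. This is a minimal sketch, not code from the answer: `expandPlaceholders` is a hypothetical helper, it assumes the multi-character capture `/#([^#]+)#/`, and it treats a key as "triggered a second time" only when it reappears on its own expansion chain, so a key that is simply used twice in unrelated places is not reported.

```php
<?php
// Hypothetical helper (not from the original posts): expands #key# placeholders
// recursively and remembers which keys are currently being expanded.
function expandPlaceholders($text, array $dict, array $active = array()) {
    return preg_replace_callback('/#([^#]+)#/', function ($m) use ($dict, $active) {
        $key = $m[1];
        if (!isset($dict[$key])) {
            throw new Exception("'$key' not found!");
        }
        if (in_array($key, $active, true)) {
            // The key is already on the current expansion chain: a cycle.
            $chain = implode(' -> ', array_merge($active, array($key)));
            throw new Exception("Circular replacement: $chain");
        }
        // Expand the replacement text with $key added to the chain.
        return expandPlaceholders($dict[$key], $dict, array_merge($active, array($key)));
    }, $text);
}

// Acyclic dictionary from the question.
$dict = array("a" => "foo", "b" => "bar #c#", "c" => "baz");
echo expandPlaceholders("#a# #b#", $dict), "\n"; // foo bar baz

// Long cycle a->b->c->d->f->a from the question.
$cyclic = array("a" => "#b#", "b" => "#c#", "c" => "#d#", "d" => "#f#", "f" => "#a#");
try {
    expandPlaceholders("#a#", $cyclic);
} catch (Exception $e) {
    echo $e->getMessage(), "\n"; // Circular replacement: a -> b -> c -> d -> f -> a
}
```

Because the chain is passed down by value, each branch of the expansion only sees its own ancestors, which is what allows the distance between the repeated keys to be arbitrarily long.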

String keys

If the keys are strings as well, this may amount to the Post Correspondence Problem, which is undecidable...

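Answer 1: (score: 0)

Thanks to georg's comment and this post, I came up with a solution that converts the patterns into a graph and uses topological sorting to check for circular replacements.

Here is my solution: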

$dict = array("a" => "foo", "b" => "bar #c#", "c" => "baz #b#");

# Store incoming and outgoing "connections" for each key => pattern replacement
$nodes = array();
foreach($dict as $patternName => $pattern) {
    if (!isset($nodes[$patternName])) {
        $nodes[$patternName] = array("in" => array(), "out" => array());
    }
    $match_count = preg_match_all('/#([^#]+)#/', $pattern, $matches);
    for ($i=0; $i<$match_count; $i++) {
        $key = $matches[1][$i];
        if (!isset($dict[$key])) { throw new Exception("'$key' not found!"); }
        if (!isset($nodes[$key])) {
            $nodes[$key] = array("in" => array(), "out" => array());
        }
        $nodes[$key]["in"][]          = $patternName;
        $nodes[$patternName]["out"][] = $key;
     }   
}
# collect leaf nodes (no incoming connections)
$leafNodes = array();
foreach ($nodes as $key => $connections) {
    if (empty($connections["in"])) {
        $leafNodes[] = $key;
    }
}
# Remove leaf nodes until none are left
while (!empty($leafNodes)) {
    $nodeID = array_shift($leafNodes);
    foreach ($nodes[$nodeID]["out"] as $outNode) {
        $nodes[$outNode]['in'] = array_diff($nodes[$outNode]['in'], array($nodeID));
        if (empty($nodes[$outNode]['in'])) {
            $leafNodes[] = $outNode;
        }
    }
    $nodes[$nodeID]['out'] = array();
}
# Check for non-leaf nodes. If any are left, there is a circular pattern
foreach ($nodes as $key => $node) {
    if (!empty($node["in"]) || !empty($node["out"]) ) {
        throw new Exception("Circular replacement pattern for '$key'!");
    }
}

# Now we can safely do replacement 
$pattern = "#a# #b#";
while ($match_count = preg_match_all('/#([^#]+)#/', $pattern, $matches)) {
    # Replace every placeholder found in this pass
    for ($i=0; $i<$match_count; $i++) {
        $key = $matches[1][$i];
        $pattern = str_replace($matches[0][$i], $dict[$key], $pattern);
    }
}
echo $pattern;
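
A side effect of the topological sort is that it also yields a safe expansion order, so each dictionary entry can be fully resolved once and the final replacement needs no loop. The following is a minimal sketch under that idea, not part of the answer above: `resolveDictionary` is a hypothetical helper, it counts unresolved references per key instead of storing "in"/"out" lists, and it assumes the multi-character capture `/#([^#]+)#/`.

```php
<?php
// Hypothetical helper: returns the dictionary with all placeholders inside
// its values expanded, or throws if some entries can never be resolved.
function resolveDictionary(array $dict) {
    $dependents = array(); // key => keys whose values reference it
    $pending = array();    // key => number of distinct unresolved references
    foreach ($dict as $name => $value) {
        preg_match_all('/#([^#]+)#/', $value, $m);
        $refs = array_unique($m[1]);
        foreach ($refs as $ref) {
            if (!isset($dict[$ref])) { throw new Exception("'$ref' not found!"); }
            $dependents[$ref][] = $name;
        }
        $pending[$name] = count($refs);
    }
    // Start with the entries that reference nothing.
    $queue = array_keys(array_filter($pending, function ($n) { return $n === 0; }));
    $resolved = array();
    while ($queue) {
        $name = array_shift($queue);
        // Everything $name refers to is already resolved, so substitute it.
        $resolved[$name] = preg_replace_callback('/#([^#]+)#/', function ($m) use (&$resolved) {
            return $resolved[$m[1]];
        }, $dict[$name]);
        foreach (isset($dependents[$name]) ? $dependents[$name] : array() as $dep) {
            if (--$pending[$dep] === 0) { $queue[] = $dep; }
        }
    }
    if (count($resolved) < count($dict)) {
        // Keys left over are on a cycle or depend on one.
        $stuck = implode(', ', array_diff(array_keys($dict), array_keys($resolved)));
        throw new Exception("Circular replacement pattern involving: $stuck");
    }
    return $resolved;
}

$resolved = resolveDictionary(array("a" => "foo", "b" => "bar #c#", "c" => "baz"));
echo preg_replace_callback('/#([^#]+)#/', function ($m) use ($resolved) {
    return $resolved[$m[1]];
}, "#a# #b#"), "\n"; // foo bar baz
```

With "c" => "baz #b#" instead, neither "b" nor "c" ever reaches a pending count of zero, so the function throws before any replacement is attempted.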