Algorithm for locating unbalanced parentheses in a string

Date: 2011-01-03 21:52:28

Tags: algorithm parsing

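PostScript/PDF string literals are enclosed in parentheses, and they may contain unescaped parentheses as long as those parentheses are fully balanced. So, for example: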

( () )  % valid string constant
( ( )   % invalid string constant, the inner ( should be escaped

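I know of an algorithm that tells me whether a string contains any unbalanced parentheses. What I am looking for is an algorithm that locates a set of unbalanced parentheses, so that I can put a backslash in front of each of them and make the whole string valid. More examples: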

(     ⟶   \(
()    ⟶   ()
(()   ⟶   \(() or (\()
())   ⟶   ()\) or (\))
()(   ⟶   ()\(

2 answers:

Answer 0 (score: 3)

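A modification of the standard stack-based algorithm for detecting unbalanced parentheses should work for you. Here is some pseudocode: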

void find_unbalanced_indices(string input)
{
    // initialize 'stack' of ints, each one the index at
    // which a lparen ( was seen

    stack<int> stack = NIL

    for (i = 0 to input.size() - 1)
    {
        // Lparen. push into the stack
        if (input[i] == '(')
        {
            // saw ( at index=i
            stack.push(i);
        }
        else if (input[i] == ')')
        {
           out = stack.pop();
           if (out == NIL)
           {
               // stack was empty. Imbalanced RParen.
               // index=i needs to be escaped
               ... 
           }  
           // otherwise, this rparen has a balanced lparen.
           // nothing to do.
        }
    }

    // check if we have any imbalanced lparens
    while (stack.size() != 0)
    {
        out = stack.pop();
        // the lparen at index=out is imbalanced
        // and needs to be escaped.
    }
}
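
For something runnable, the pseudocode above could be sketched in Python roughly as follows. The function names `find_unbalanced_indices` and `escape_unbalanced` are illustrative, and the sketch assumes the input contains no parentheses that are already backslash-escaped.

```python
def find_unbalanced_indices(text):
    """Return the sorted indices of parentheses that have no partner."""
    open_stack = []   # indices of '(' still waiting for a ')'
    unbalanced = []   # indices of ')' seen while nothing was open
    for i, ch in enumerate(text):
        if ch == '(':
            open_stack.append(i)
        elif ch == ')':
            if open_stack:
                open_stack.pop()       # matched pair, nothing to escape
            else:
                unbalanced.append(i)   # stray ')'
    # Anything left on the stack is a '(' that was never closed.
    return sorted(unbalanced + open_stack)


def escape_unbalanced(text):
    """Prefix every unbalanced parenthesis in text with a backslash.

    Assumption: text has no pre-existing backslash escapes.
    """
    bad = set(find_unbalanced_indices(text))
    return ''.join('\\' + ch if i in bad else ch
                   for i, ch in enumerate(text))
```

With the examples from the question, `escape_unbalanced('(()')` gives `\(()` and `escape_unbalanced('())')` gives `()\)`.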

Hope this helps.

Answer 1 (score: 0)

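def escape(s):
    # The inner pass (left to right) escapes unmatched ')'. Its output comes
    # back reversed, so the outer pass, with the paren roles swapped, escapes
    # unmatched '(' and then reverses the tokens back into the original order.
    return ''.join(r(')(', r('()', s)))

def r(parens, chars):
    # Run one escaping pass and hand back its output in reverse order.
    return reversed(list(escape_oneway(parens, chars)))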

def escape_oneway(parens, chars):
    """Given a sequence of characters (possibly already escaped),
    escape those close-parens without a matching open-paren."""
    depth = 0
    for x in chars:
        if x == parens[0]:
            depth += 1
        if x == parens[1]:
            if depth == 0:
                yield '\\' + x
                continue
            else:
                depth -= 1
        yield x
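
One way to sanity-check the result of either answer is to confirm that the escaped string is balanced once backslash-escaped characters are skipped. A minimal sketch, where `is_balanced_with_escapes` is a hypothetical helper name and the only assumption about escapes is that a backslash protects the single character after it:

```python
def is_balanced_with_escapes(text):
    """Return True if every unescaped paren in text has a partner."""
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '\\':
            i += 2             # skip the backslash and the char it protects
            continue
        if ch == '(':
            depth += 1
        elif ch == ')':
            if depth == 0:
                return False   # close-paren with nothing open
            depth -= 1
        i += 1
    return depth == 0          # leftover depth means an unclosed '('
```

For instance, `is_balanced_with_escapes(r'\(()')` is true while `is_balanced_with_escapes('(()')` is false.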