Algorithm for BASH brace expansion

Time: 2015-02-15 12:24:43

Tags: algorithm

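I'm stuck on this algorithm question:

Design an algorithm that parses an expression like this: "((a,b,cy)n,m)" should give: an - bn - cyn - m

The expression can be nested, so "((a,b)o(m,n)p,b)" parses to: aomp - aonp - bomp - bonp - b.

I thought about using stacks, but that gets too complicated. Thanks.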

2 answers:

Answer 0: (score: 4)

You can parse it with a Recursive Descent Parser.

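Let's say the comma-separated strings are components, so for the expression ((a, b, cy)n, m), (a, b, cy)n and m are two components. a, b and cy are also components. So this is a recursive definition.

For the component (a, b, cy)n, let's say (a, b, cy) and n are the two component parts of the component. Component parts are combined later to produce the final result (i.e., an - bn - cyn).

Let's say an expression is a list of comma-separated components; for example, (a, cy)n, m is an expression. It has two components, (a, cy)n and m. The component (a, cy)n has two component parts, (a, cy) and n. The component part (a, cy) is a brace expression containing the nested expression a, cy, which in turn has two components, a and cy.

With these definitions (you might use other terms), we can write down the grammar for your expression: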

expression     = component, component, ...
component      = component_part component_part ...
component_part = letters | (expression)

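Each line is one grammar rule. The first line says an expression is a comma-separated list of components. The second line says a component is built from one or more component parts. The third line says a component part is either a continuous sequence of letters or a nested expression inside a pair of parentheses.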
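To make the three rules concrete, here is a minimal Java sketch of a recognizer that only checks whether a string matches the grammar, without expanding it. The class and method names (GrammarSketch, matches, and so on) are hypothetical and not part of the answer, and the sketch assumes the input contains no whitespace.

```java
// Hypothetical recognizer: one private method per grammar rule.
// Assumption: the input contains only letters, '(', ')' and ','.
final class GrammarSketch {
    private final String text;
    private int pos = 0;

    private GrammarSketch(String text) {
        this.text = text;
    }

    // True when the whole input is a valid expression.
    static boolean matches(String input) {
        GrammarSketch g = new GrammarSketch(input);
        return g.expression() && g.pos == input.length();
    }

    // expression = component, component, ...
    private boolean expression() {
        if (!component()) {
            return false;
        }
        while (pos < text.length() && text.charAt(pos) == ',') {
            pos++; // consume ','
            if (!component()) {
                return false;
            }
        }
        return true;
    }

    // component = component_part component_part ...
    private boolean component() {
        if (!componentPart()) {
            return false;
        }
        while (startsPart()) {
            if (!componentPart()) {
                return false;
            }
        }
        return true;
    }

    // component_part = letters | (expression)
    private boolean componentPart() {
        if (pos >= text.length()) {
            return false;
        }
        if (text.charAt(pos) == '(') {
            pos++; // consume '('
            if (!expression()) {
                return false;
            }
            if (pos >= text.length() || text.charAt(pos) != ')') {
                return false; // unbalanced parentheses
            }
            pos++; // consume ')'
            return true;
        }
        if (!Character.isLetter(text.charAt(pos))) {
            return false;
        }
        while (pos < text.length() && Character.isLetter(text.charAt(pos))) {
            pos++; // consume a run of letters
        }
        return true;
    }

    // A component part can only start with '(' or a letter.
    private boolean startsPart() {
        return pos < text.length()
                && (text.charAt(pos) == '(' || Character.isLetter(text.charAt(pos)));
    }
}
```

For example, GrammarSketch.matches("((a,b,cy)n,m)") accepts the input, while an unbalanced string such as "(a,b" is rejected.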

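Then you can use a Recursive Descent Parser to solve the problem with the grammar above.

We will define one method/function for each grammar rule. So basically we will have three methods: ParseExpression, ParseComponent and ParseComponentPart.

Algorithm

As stated above, an expression is a list of comma-separated components, so our ParseExpression method simply calls ParseComponent and then checks whether the next char is a comma, like this (I'm using C#; I think you can easily convert it to other languages):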

private List<string> ParseExpression()
{
    var result = new List<string>();

    while (!Eof())
    {
        // Parsing a component will produce a list of strings,
        // they are added to the final string list

        var items = ParseComponent();

        result.AddRange(items);

        // If next char is ',' simply skip it and parse next component
        if (Peek() == ',')
        {
            // Skip comma
            ReadNextChar();
        }
        else
        {
            break;
        }
    }

    return result;
}

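You can see that when we parse an expression we call ParseComponent, which in turn calls ParseComponentPart, which may call ParseExpression again for a nested expression. It is a top-down approach, which is why it is called recursive descent parsing.

ParseComponent is similar, like this: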

private List<string> ParseComponent()
{
    List<string> leftItems = null;

    while (!Eof())
    {
        // Parse a component part will produce a list of strings (rightItems)
        // We need to combine already parsed string list (leftItems) in this component
        // with the newly parsed 'rightItems'
        var rightItems = ParseComponentPart();
        if (rightItems == null)
        {
            // No more parts, return current result (leftItems) to the caller
            break;
        }

        if (leftItems == null)
        {
            leftItems = rightItems;
        }
        else
        {
            leftItems = Combine(leftItems, rightItems);
        }
    }

    return leftItems;
}

The Combine method simply combines two string lists, concatenating every left item with every right item:

// Combine two lists of strings and return the combined string list
private List<string> Combine(List<string> leftItems, List<string> rightItems)
{
    var result = new List<string>();

    foreach (var leftItem in leftItems)
    {
        foreach (var rightItem in rightItems)
        {
            result.Add(leftItem + rightItem);
        }
    }

    return result;
}
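As a quick illustration of what this combination step produces, here is the same idea sketched in Java. The class name CombineSketch is hypothetical; the logic mirrors the C# Combine method above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java counterpart of the C# Combine method.
final class CombineSketch {
    // Concatenate every left item with every right item, left items varying slowest.
    static List<String> combine(List<String> leftItems, List<String> rightItems) {
        List<String> result = new ArrayList<>();
        for (String left : leftItems) {
            for (String right : rightItems) {
                result.add(left + right);
            }
        }
        return result;
    }
}
```

Combining [a, b, cy] with [n] gives [an, bn, cyn], and combining [a, b] with [m, n] gives [am, an, bm, bn], which is the order the question expects.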

Then ParseComponentPart:

private List<string> ParseComponentPart()
{
    var nextChar = Peek();

    if (nextChar == '(')
    {
        // Skip '('
        ReadNextChar();

        // Recursively parse the inner expression
        var items = ParseExpression();

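        // Skip ')'; if it is missing, the parentheses are unbalanced
        if (Peek() != ')')
        {
            throw new FormatException("Expected ')' at index " + _nextCharIndex);
        }
        ReadNextChar();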

        return items;
    }
    else if (char.IsLetter(nextChar))
    {
        var letters = ReadLetters();

        return new List<string> { letters };
    }
    else
    {
        // Fail to parse a part, it means a component is ended
        return null;
    }
}

Complete source code (C#)

The other parts are mostly helper methods; the complete C# source code is below:

using System;
using System.Collections.Generic;
using System.Text;

namespace Examples
{
    public class BashBraceParser
    {
        private string _expression;
        private int _nextCharIndex;

        /// <summary>
        /// Parse the specified BASH brace expression and return the result string list.
        /// </summary>
        public IList<string> Parse(string expression)
        {
            _expression = expression;
            _nextCharIndex = 0;

            return ParseExpression();
        }

        private List<string> ParseExpression()
        {
            // ** This part is already posted above **
        }

        private List<string> ParseComponent()
        {
            // ** This part is already posted above **
        }

        private List<string> ParseComponentPart()
        {
            // ** This part is already posted above **
        }

        // Combine two lists of strings and return the combined string list
        private List<string> Combine(List<string> leftItems, List<string> rightItems)
        {
            // ** This part is already posted above **
        }

        // Peek next char without moving the cursor
        private char Peek()
        {
            if (Eof())
            {
                return '\0';
            }

            return _expression[_nextCharIndex];
        }

        // Read next char and move the cursor to next char
        private char ReadNextChar()
        {
            return _expression[_nextCharIndex++];
        }

        private void UnreadChar()
        {
            _nextCharIndex--;
        }

        // Check if the whole expression string is scanned.
        private bool Eof()
        {
            return _nextCharIndex == _expression.Length;
        }

        // Read a continuous sequence of letters.
        private string ReadLetters()
        {
            if (!char.IsLetter(Peek()))
            {
                return null;
            }

            var str = new StringBuilder();

            while (!Eof())
            {
                var ch = ReadNextChar();
                if (char.IsLetter(ch))
                {
                    str.Append(ch);
                }
                else
                {
                    UnreadChar();
                    break;
                }
            }

            return str.ToString();
        }
    }
}

Using the code

var parser = new BashBraceParser();
var result = parser.Parse("((a,b)o(m,n)p,b)");

var output = String.Join(" - ", result);

// Result: aomp - aonp - bomp - bonp - b
Console.WriteLine(output);
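For readers who prefer Java, the following is a rough port of the same recursive descent approach. It is a sketch under a few assumptions rather than a translation of the exact C# code: the names BraceExpander and expand are made up here, an empty component contributes no strings, and a missing ')' is reported with an IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java port of the recursive descent parser described above.
final class BraceExpander {
    private final String text;
    private int pos = 0;

    private BraceExpander(String text) {
        this.text = text;
    }

    static List<String> expand(String expression) {
        return new BraceExpander(expression).parseExpression();
    }

    // Next char without consuming it; '\0' at end of input.
    private char peek() {
        return pos < text.length() ? text.charAt(pos) : '\0';
    }

    // expression = component, component, ...
    private List<String> parseExpression() {
        List<String> result = new ArrayList<>();
        while (true) {
            result.addAll(parseComponent());
            if (peek() != ',') {
                break;
            }
            pos++; // skip ','
        }
        return result;
    }

    // component = component_part component_part ...
    private List<String> parseComponent() {
        List<String> items = null;
        while (true) {
            List<String> part = parseComponentPart();
            if (part == null) {
                break; // no more parts in this component
            }
            items = (items == null) ? part : combine(items, part);
        }
        // Assumption: an empty component yields no strings.
        return items == null ? new ArrayList<>() : items;
    }

    // component_part = letters | (expression); returns null if neither matches.
    private List<String> parseComponentPart() {
        char c = peek();
        if (c == '(') {
            pos++; // skip '('
            List<String> inner = parseExpression();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at index " + pos);
            }
            pos++; // skip ')'
            return inner;
        }
        if (Character.isLetter(c)) {
            int start = pos;
            while (Character.isLetter(peek())) {
                pos++;
            }
            List<String> single = new ArrayList<>();
            single.add(text.substring(start, pos));
            return single;
        }
        return null;
    }

    // Concatenate every left item with every right item.
    private static List<String> combine(List<String> left, List<String> right) {
        List<String> result = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                result.add(l + r);
            }
        }
        return result;
    }
}
```

Calling String.join(" - ", BraceExpander.expand("((a,b)o(m,n)p,b)")) should give aomp - aonp - bomp - bonp - b, matching the C# usage above.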

Answer 1: (score: 3)

import java.util.ArrayList;

public class BASHBraceExpansion {

    public static ArrayList<StringBuilder> parse_bash(String expression, WrapperInt p) {

         ArrayList<StringBuilder> elements = new ArrayList<StringBuilder>();
         ArrayList<StringBuilder> result = new ArrayList<StringBuilder>();
         elements.add(new StringBuilder(""));

        while(p.index < expression.length())
        {
            if (expression.charAt(p.index) == '(')
            {
                p.advance();
                ArrayList<StringBuilder> temp = parse_bash(expression, p);
                ArrayList<StringBuilder> newElements = new ArrayList<StringBuilder>(); 
                for(StringBuilder e : elements)
                {
                    for(StringBuilder t : temp)
                    {
                        StringBuilder s = new StringBuilder(e);
                        newElements.add(s.append(t));
                    }
                }
                elements = newElements;
            }
            else if (expression.charAt(p.index) == ',')
            {
                result.addAll(elements);
                elements.clear();
                elements.add(new StringBuilder(""));
                p.advance();
            }
            else if (expression.charAt(p.index) == ')')
            {
                p.advance();
                result.addAll(elements);

                return result;
            }
            else
            {
                for(StringBuilder sb : elements)
                {
                    sb.append(expression.charAt(p.index));
                }
                p.advance();
            }
        }

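        // Flush the last component so that a top-level expression without
        // enclosing parentheses (e.g. "a,b") keeps its earlier components.
        result.addAll(elements);
        return result;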
    }

    public static void print(ArrayList<StringBuilder> list)
    {
        for(StringBuilder s : list)
        {
            System.out.print(s + " * ");
        }
        System.out.println();
    }
    public static void main(String[] args) {
        WrapperInt p = new WrapperInt();
        ArrayList<StringBuilder> list = parse_bash("((a,b)o(m,n)p,b)", p);
        //ArrayList<StringBuilder> list = parse_bash("(a,b)", p);
        WrapperInt q = new WrapperInt();

        ArrayList<StringBuilder> list1 = parse_bash("((a,b,cy)n,m)", q);
        ArrayList<StringBuilder> list2 = parse_bash("((a,b)dr(f,g)(k,m),L(p,q))", new WrapperInt());

        System.out.println("*****RESULT : ******");
        print(list);
        print(list1);
        print(list2);
    }

}


public class WrapperInt {
    public WrapperInt() {
        index = 0;
    }
    public int advance()
    {
        index ++;
        return index;
    }
    public int index;
}

// aomp - aonp - bomp - bonp - b.