Splitting a SQL WHERE clause into an array

Asked: 2013-07-15 07:44:42

Tags: c# regex arrays split

I am trying to split a string (the WHERE clause of a SQL statement) into arrays of 7 elements, one array per condition, where each index holds the following:

0 - The initial clause (WHERE/AND/OR) plus any opening brackets, e.g. "AND((("
1 - Either the table the first operand comes from, or "VALUE" if it's a value, e.g. "transactions"
2 - The field name or value, e.g. "id"
3 - The joining operator, e.g. ">"
4 - Either the table the second operand comes from, or "VALUE" if it's a value, e.g. "transactions"
5 - The field name or value, e.g. "id"
6 - Any closing brackets, e.g. ")))"

For example, looping over the following string would output the arrays below:

WHERE transactions.status_code= 'AFA 2'
AND (transactions.supp_ref = supplier.supp_ref
AND supplier.supp_addr_ref = address.addr_ref)
OR transactions.user_code = user.user_code

output[0] = "WHERE"
output[1] = "transactions"
output[2] = "status_code"
output[3] = "="
output[4] = "VALUE"
output[5] = "AFA 2"
output[6] = ""

output[0] = "AND("
output[1] = "transactions"
output[2] = "supp_ref"
output[3] = "="
output[4] = "supplier"
output[5] = "supp_ref"
output[6] = ""

output[0] = "AND"
output[1] = "supplier"
output[2] = "supp_addr_ref"
output[3] = "="
output[4] = "address"
output[5] = "addr_ref"
output[6] = ")"

output[0] = "OR"
output[1] = "transactions"
output[2] = "user_code"
output[3] = "="
output[4] = "user"
output[5] = "user_code"
output[6] = ""

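For the rest of the SQL statement I have managed to split things up in a similar way using the String.Split method, but because WHERE clauses vary so much I am struggling with this part. From looking around, I think I would be better off using a regular expression, but I cannot work out what is needed. Any help or pointers would be much appreciated.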

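2 Answers:

Answer 0: (score: 0)

OK, first of all, I think a regular expression may not be a good fit for what you are trying to do. That said, here is a regex that will parse what you posted and turn it into what you are looking for: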

(?<Group>(?<Concat>where|\s*?\)?\s*?and\s*?\(?|\s*?\)?\s*?or\s*?\(?)(?<TableName>[\w\s]+(?=\.))\.?(?<ColName>.+?(?=\=|like|between|\<\>|\>\=|\<\=|in|\>|\<))\s*?(?<Compare>\=|like|between|\<\>|\>\=|\<\=|in|\>|\<)(?<Value>.*?(?=\s*?and\s*?\(*|or\*?\(*)|.*))

I am sure this does not cover everything, and it may behave differently depending on the regex engine. I use The Regulator for my regex work.
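As a rough illustration of how the pattern above might be applied in C#, the sketch below runs it with `Regex.Matches` and reads the named groups. The class name `WhereRegexDemo`, the method `Extract`, and the use of `RegexOptions.IgnoreCase` are assumptions made for this example (the pattern is written in lower case while the sample SQL uses upper-case keywords); they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class WhereRegexDemo
{
    // The pattern from the answer, copied verbatim into a verbatim string.
    private const string Pattern =
        @"(?<Group>(?<Concat>where|\s*?\)?\s*?and\s*?\(?|\s*?\)?\s*?or\s*?\(?)(?<TableName>[\w\s]+(?=\.))\.?(?<ColName>.+?(?=\=|like|between|\<\>|\>\=|\<\=|in|\>|\<))\s*?(?<Compare>\=|like|between|\<\>|\>\=|\<\=|in|\>|\<)(?<Value>.*?(?=\s*?and\s*?\(*|or\*?\(*)|.*))";

    // Returns one array per match: Concat, TableName, ColName, Compare, Value.
    // Each captured value is trimmed because the pattern keeps surrounding whitespace.
    public static List<string[]> Extract(string whereClause)
    {
        var rows = new List<string[]>();

        // Assumption: IgnoreCase, so that "WHERE", "AND" and "OR" match the lower-case pattern.
        foreach (Match m in Regex.Matches(whereClause, Pattern, RegexOptions.IgnoreCase))
        {
            rows.Add(new[]
            {
                m.Groups["Concat"].Value.Trim(),
                m.Groups["TableName"].Value.Trim(),
                m.Groups["ColName"].Value.Trim(),
                m.Groups["Compare"].Value.Trim(),
                m.Groups["Value"].Value.Trim()
            });
        }
        return rows;
    }
}
```

Calling `WhereRegexDemo.Extract(sql)` on a WHERE clause returns one five-element array per condition found. Note that the pattern has no group for closing brackets and does not split the right-hand side into table and field, so it only gets part of the way to the seven-slot layout in the question.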

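I would suggest writing a parser that does this. Take a look at the code below; it may help if you decide to go down that route. I am not entirely sure what you are doing with the "VALUE" string there, but if you want to work out what is a value and what is a table.colName, you could easily add that to this. Recognizing things like ('a','b') will be harder, but I think it is doable.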

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;

    class Program
    {
        // Keywords that start a new chunk.
        static readonly string[] delims = { "and", "or", "where" };

        // A token is a quoted literal, a comparison operator, a bracket, or a word.
        // Two-character operators are listed first so "<=" is not split into "<" and "=".
        // Words such as like, in and between match [\w.]+. This list may not be complete.
        // Tokenizing this way keeps quoted values such as 'AFA 2' intact and avoids
        // splitting identifiers that merely contain "or", "in" or "and".
        static readonly Regex tokens = new Regex(@"'[^']*'|<>|<=|>=|[<>=()]|[\w.]+");

        static string testData = @"WHERE transactions.status_code= 'AFA 2'
    AND (transactions.supp_ref = supplier.supp_ref
    AND supplier.supp_addr_ref = address.addr_ref)
    OR transactions.user_code = user.user_code";

        static void Main(string[] args)
        {
            Print(Parse(testData));
            Console.ReadKey();
        }

        static List<List<string>> Parse(string input)
        {
            List<List<string>> ret = new List<List<string>>();
            List<string> currList = null;
            foreach (Match match in tokens.Matches(input))
            {
                string item = match.Value;
                if (delims.Contains(item.ToLower()))
                {
                    // A keyword closes the previous chunk and starts a new one.
                    if (currList != null)
                        ret.Add(currList);
                    currList = new List<string>();
                    currList.Add(item.ToLower());
                }
                else
                {
                    // Tolerate input that does not start with a keyword.
                    if (currList == null)
                        currList = new List<string>();

                    if (item.Contains(".") && !item.StartsWith("'"))
                    {
                        // table.column becomes two entries.
                        string[] tmp = item.Split('.');
                        currList.Add(tmp[0]);
                        currList.Add(tmp[1]);
                    }
                    else
                        currList.Add(item);
                }
            }
            if (currList != null)
                ret.Add(currList);
            return ret;
        }

        static void Print(List<List<string>> input)
        {
            StringBuilder sb = new StringBuilder();
            foreach (List<string> item in input)
            {
                sb.Append("New Chunk:\n");
                foreach (string str in item)
                {
                    sb.Append(string.Format("\t{0}\n", str));
                }
                sb.Append("\n");
            }

            Console.WriteLine(sb.ToString());
        }
    }
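The parser above returns a flat list of tokens per condition. To get from there to the seven-slot layout in the question, including the "VALUE" marker, something along the following lines might work. This is a minimal sketch, not the answer's own code: the class `WhereClauseSplitter` and its methods are hypothetical names, and it assumes every condition is a simple `operand operator operand` comparison where column references are always written as `table.column`. Constructs such as `BETWEEN x AND y` or `IN ('a','b')` are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class WhereClauseSplitter
{
    // A token is a quoted literal, a comparison operator, a bracket, or a word.
    private static readonly Regex Token =
        new Regex(@"'[^']*'|<>|<=|>=|[<>=()]|[\w.]+");

    // Splits a WHERE clause into one 7-element array per condition,
    // using the slot layout from the question (indexes 0 to 6).
    public static List<string[]> Split(string whereClause)
    {
        var rows = new List<string[]>();
        string[] row = null;
        int operands = 0; // operands already stored in the current row

        foreach (Match m in Token.Matches(whereClause))
        {
            string t = m.Value;
            if (IsKeyword(t))
            {
                // WHERE/AND/OR starts a new condition.
                row = new[] { t.ToUpperInvariant(), "", "", "", "", "", "" };
                rows.Add(row);
                operands = 0;
            }
            else if (row == null)
            {
                continue; // skip anything before the first WHERE/AND/OR
            }
            else if (t == "(")
            {
                if (operands == 0) row[0] += t; // opening brackets join the prefix
            }
            else if (t == ")")
            {
                row[6] += t; // closing brackets are collected in the last slot
            }
            else if (operands == 1 && row[3] == "")
            {
                row[3] = t; // the token after the first operand is the operator
            }
            else if (operands < 2)
            {
                // Assumption: only unquoted, non-numeric table.column tokens are columns.
                int slot = operands == 0 ? 1 : 4;
                int dot = t.IndexOf('.');
                bool isColumn = t[0] != '\'' && !char.IsDigit(t[0]) && dot > 0;
                row[slot] = isColumn ? t.Substring(0, dot) : "VALUE";
                row[slot + 1] = isColumn ? t.Substring(dot + 1) : t.Trim('\'');
                operands++;
            }
        }
        return rows;
    }

    private static bool IsKeyword(string t)
    {
        return string.Equals(t, "WHERE", StringComparison.OrdinalIgnoreCase)
            || string.Equals(t, "AND", StringComparison.OrdinalIgnoreCase)
            || string.Equals(t, "OR", StringComparison.OrdinalIgnoreCase);
    }
}
```

Calling `WhereClauseSplitter.Split(sql)` on the sample WHERE clause from the question yields four arrays whose contents match the `output[0]` to `output[6]` listings given there. Walking the tokens once and tracking how many operands have been seen keeps the logic small, at the cost of only supporting two-operand comparisons.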

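Answer 1: (score: 0)

If you are going to parse SQL, you may want to look at the ScriptDom namespace. It may be more than you need, but it includes SQL parsers that can give you detailed information about a given SQL query.

Here are some resources.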

MSDN ScriptDOM reference
An easier introduction