I'm trying to split a string (the WHERE clause of a SQL statement) into an array of 7 elements per condition, with the following data held at each index:
0 - The initial clause (WHERE/AND/OR) plus any opening brackets, e.g. "AND((("
1 - Either the table the first operand comes from, or "VALUE" if it's a value, e.g. "transactions"
2 - The field name or value, e.g. "id"
3 - The comparison operator, e.g. >
4 - Either the table the second operand comes from, or "VALUE" if it's a value, e.g. "transactions"
5 - The field name or value, e.g. "id"
6 - Any closing brackets, e.g. ")))"
For example, looping through the following string would output the following arrays:
WHERE transactions.status_code= 'AFA 2'
AND (transactions.supp_ref = supplier.supp_ref
AND supplier.supp_addr_ref = address.addr_ref)
OR transactions.user_code = user.user_code
output[0] = "WHERE"
output[1] = "transactions"
output[2] = "status_code"
output[3] = "="
output[4] = "VALUE"
output[5] = "AFA 2"
output[6] = ""
output[0] = "AND("
output[1] = "transactions"
output[2] = "supp_ref"
output[3] = "="
output[4] = "supplier"
output[5] = "supp_ref"
output[6] = ""
output[0] = "AND"
output[1] = "supplier"
output[2] = "supp_addr_ref"
output[3] = "="
output[4] = "address"
output[5] = "addr_ref"
output[6] = ")"
output[0] = "OR"
output[1] = "transactions"
output[2] = "user_code"
output[3] = "="
output[4] = "user"
output[5] = "user_code"
output[6] = ""
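For the rest of the SQL statement I have successfully split things up in a similar way using the String.Split method, but the WHERE clause varies too much and I'm having difficulty with it. From looking around, I think I'd be better off with a regular expression, but I can't work out the pattern I need. Any help or pointers would be greatly appreciated.
Answer 0 (score: 0)
OK, first off, I think a regular expression may not be the right fit for what you're trying to do. That said, here is a regex that will parse what you posted and turn it into what you want: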
(?<Group>(?<Concat>where|\s*?\)?\s*?and\s*?\(?|\s*?\)?\s*?or\s*?\(?)(?<TableName>[\w\s]+(?=\.))\.?(?<ColName>.+?(?=\=|like|between|\<\>|\>\=|\<\=|in|\>|\<))\s*?(?<Compare>\=|like|between|\<\>|\>\=|\<\=|in|\>|\<)(?<Value>.*?(?=\s*?and\s*?\(*|or\*?\(*)|.*))
I'm sure this doesn't cover everything, and it may behave differently depending on the regex engine. I use The Regulator for my regex work.
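To make the named groups concrete, here is a minimal sketch of running that pattern with .NET's Regex class. The class and method names (WhereRegexDemo, ExtractClauses) are made up for illustration, and RegexOptions.IgnoreCase is an assumption: the pattern spells its keywords in lowercase, so it would not match an uppercase WHERE/AND/OR without it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WhereRegexDemo
{
    // The pattern from the answer, copied unchanged.
    private const string Pattern =
        @"(?<Group>(?<Concat>where|\s*?\)?\s*?and\s*?\(?|\s*?\)?\s*?or\s*?\(?)(?<TableName>[\w\s]+(?=\.))\.?(?<ColName>.+?(?=\=|like|between|\<\>|\>\=|\<\=|in|\>|\<))\s*?(?<Compare>\=|like|between|\<\>|\>\=|\<\=|in|\>|\<)(?<Value>.*?(?=\s*?and\s*?\(*|or\*?\(*)|.*))";

    // Hypothetical helper: returns one row per matched condition, laid out as
    // { Concat, TableName, ColName, Compare, Value }. Every value is trimmed
    // because the groups capture surrounding whitespace.
    public static List<string[]> ExtractClauses(string whereClause)
    {
        var rows = new List<string[]>();
        // IgnoreCase is an assumption (see the lead-in above).
        foreach (Match m in Regex.Matches(whereClause, Pattern, RegexOptions.IgnoreCase))
        {
            rows.Add(new[]
            {
                m.Groups["Concat"].Value.Trim(),
                m.Groups["TableName"].Value.Trim(),
                m.Groups["ColName"].Value.Trim(),
                m.Groups["Compare"].Value.Trim(),
                m.Groups["Value"].Value.Trim()
            });
        }
        return rows;
    }
}
```

Calling WhereRegexDemo.ExtractClauses on the WHERE clause from the question gives one row per condition. The right-hand side comes back as a single Value string (for example a whole table.column reference, possibly with a trailing bracket), so it would still need to be split further to get the seven-slot layout the question asks for.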
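I would suggest writing a parser that does this instead. Take a look at the code below; it may help if you decide to go that route. I'm not entirely sure what you're doing with that "VALUE" string, but if you want to work out what is a value and what is a table.colName, you could easily add that to this. Recognizing things like in ('a','b') will be harder, but I think it's doable.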
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

class Program
{
    // Tokens that get padded with spaces so the input can be split on ' '.
    // This list may not be complete. '.' is not in here; Parse handles it later.
    // Caveat: the padding uses plain String.Replace, so "<=" and ">=" come out as
    // two tokens ("<" and "="), and a keyword inside an identifier (e.g. the "or"
    // in "order") gets split as well.
    static string[] specChars = new string[] { "<", ">", "<=", ">=", "=", "like", "in", "between", "or", "and", "(", ")", "where" };
    // Keywords that start a new chunk.
    static string[] delims = new string[] { "and", "or", "where" };
    static string testData = @"WHERE transactions.status_code= 'AFA 2'
AND (transactions.supp_ref = supplier.supp_ref
AND supplier.supp_addr_ref = address.addr_ref)
OR transactions.user_code = user.user_code";

    static void Main(string[] args)
    {
        Print(Parse(testData));
        Console.ReadKey();
    }

    static List<List<string>> Parse(string input)
    {
        List<List<string>> ret = new List<List<string>>();
        // Remove all whitespace first, because we are going to put spaces back
        // exactly where we want them. Note that this, together with ToLower,
        // also changes string literals: 'AFA 2' becomes 'afa2'.
        input = input.Replace(" ", "").Replace("\r", "").Replace("\n", "").ToLower();
        foreach (string item in specChars)
        {
            // Pad each special token with spaces so Split(' ') isolates it.
            input = input.Replace(item, string.Format(" {0} ", item));
        }
        string[] splits = input.Split(' ');
        List<string> currList = null;
        foreach (string item in splits.Where(x => x.Length > 0))
        {
            if (delims.Contains(item))
            {
                // A delimiter closes the current chunk (if any) and starts a new one.
                if (currList != null)
                    ret.Add(currList);
                currList = new List<string>();
                currList.Add(item);
            }
            else
            {
                // Guard against input that does not start with WHERE/AND/OR.
                if (currList == null)
                    currList = new List<string>();
                if (item.Contains("."))
                {
                    // table.column becomes two entries: table, then column.
                    string[] tmp = item.Split('.');
                    currList.Add(tmp[0]);
                    currList.Add(tmp[1]);
                }
                else
                    currList.Add(item);
            }
        }
        if (currList != null)
            ret.Add(currList);
        return ret;
    }

    static void Print(List<List<String>> input)
    {
        StringBuilder sb = new StringBuilder();
        foreach (List<String> item in input)
        {
            sb.Append("New Chunk:\n");
            foreach (string str in item)
            {
                sb.Append(string.Format("\t{0}\n", str));
            }
            sb.Append("\n");
        }
        Console.WriteLine(sb.ToString());
    }
}
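The answer says that telling a value apart from a table.colName reference could be added to the parser. One possible way to do that is sketched below. OperandClassifier and Classify are hypothetical names, and the rules are assumptions: a token is a value if it is a single-quoted literal or parses as a number, and otherwise it is split on the first dot.

```csharp
using System;
using System.Globalization;

public static class OperandClassifier
{
    // Hypothetical helper: returns (table, column) for "table.column",
    // or ("VALUE", literal) for anything treated as a value.
    public static (string Table, string Name) Classify(string operand)
    {
        string token = operand.Trim();

        // Single-quoted string literal: strip the quotes and report a value.
        if (token.Length >= 2 && token[0] == '\'' && token[token.Length - 1] == '\'')
            return ("VALUE", token.Substring(1, token.Length - 2));

        // Numeric literal. This check runs before the dot check so that
        // something like 1.5 is not mistaken for table "1", column "5".
        if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out _))
            return ("VALUE", token);

        // table.column: split on the first dot only.
        int dot = token.IndexOf('.');
        if (dot > 0 && dot < token.Length - 1)
            return (token.Substring(0, dot), token.Substring(dot + 1));

        // Assumption: anything else is a value. An unqualified column name
        // would land here too, so adjust this if your queries use them.
        return ("VALUE", token);
    }
}
```

The two parts of the result map onto slots 1/2 and 4/5 of the array described in the question. The operand has to reach Classify with its original spacing and case intact, so it would need to be called on the raw text rather than on tokens that have already been lowercased and stripped of spaces.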
Answer 1 (score: 0)
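If you are going to parse SQL, you may want to look at the ScriptDom namespace. It may be more than you need for this, but it includes SQL parsers that can give you detailed information about a given SQL query.
Here are some resources.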