Scanning T-SQL with regular expressions to get object dependencies

Asked: 2011-12-04 13:32:26

Tags: c# sql regex

I'm writing a C# class library that scans SQL Server queries and extracts the objects in a query into the correct groups, for example:

SELECT * FROM "My Server"."Northwind"."dbo"."Product Sales for 1997" Group By CategoryID

The following regex matches the string above and captures "My Server", "Northwind", "dbo" and "Product Sales for 1997" into four separate groups, which is what I want.

(?i)\bFROM\b\s+[\["]([^\]"]*)[\]"].{1}[\["]([^\]"]*)[\]"].{1}[\["]([^\]"]*)[\]"].{1}[\["]([^\]"]*)[\]"].{1}

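What I'm looking for is a single regex that captures the server name, database name, schema name, and object name for any of the following combinations (this is not an exhaustive list):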

SELECT * FROM dbo."Product Sales for 1997" // should return groups 2 & 3
SELECT * FROM Northwind."My Schema"."My view or table function" // should return  groups 1, 2 & 3
SELECT * FROM "My view or table function" // should return group 3
SELECT * FROM dbo."My View 1" AS V1 JOIN "My View 1" AS V2 ON V1.ID = V2 // should return groups 2 & 3

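In other words, I want to capture the various components into the following groups:

Group 0 -> server name
Group 1 -> database name
Group 2 -> schema
Group 3 -> object name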

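I'm trying to avoid creating a separate regex for each possible combination, so that my class library doesn't become too large and complex, but as a regex n00b I'm finding it a bit difficult.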

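2 answers:

Answer 0 (score: 0)

The best you can do with a regex is to parse the text into tokens; you then have to work out what each token actually is (server, database, and so on). Here is a regex that splits your sample data into such tokens. Note that I was not aware SQL Server used quotes, but your examples require them, so I use an if conditional (see my blog post Regular Expressions and the If Conditional) to look for double and single quotes, escaped as \x22 and \x27 respectively. The tokens are collected in the captures of the Tokens group on each match, from which they are extracted.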

// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
string data =
@"SELECT * FROM dbo.""Product Sales for 1997"" // should return groups 2 & 3 
SELECT * FROM Northwind.""My Schema"".""My view or table function"" // should return  groups 1, 2 & 3 
SELECT * FROM ""My view or table function"" // should return group 3 
SELECT * FROM dbo.""My View 1"" AS V1 JOIN ""My View 1"" AS V2 ON V1.ID = V2 // should return groups 2 & 3 ";

string pattern = 
@"
(?:FROM\s+)                 # Work from a from only
(
  (?([\x27\x22])            # If a single or double quote is found      
     (?:[\x27\x22])
       (?<Tokens>[\w\s]+)   # process quoted text
     (?:[\x27\x22]\.?)
   |                        # else
     (?!\s+AS|\s+WHERE)     # if AS or Where is found stop the match we are done
     (?:\.?)
     (?<Tokens>\w+)         # Process non quoted token.
     (?:\.?)
   )
   (?![\n\r/])              # Stop on CR/LF or a comment. 
){0,4}                      # Repeat at most 4 times: server, database, schema, object (parser hint to stop)
";

Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace) // Ignore is to allow commenting of the pattern only (not data processing)
    .OfType<Match>()
    .Select(mt => mt.Groups["Tokens"]
                    .Captures.OfType<Capture>()
                    .Select(cp => cp.Value))
    .ToList() // To do the foreach below
    .ForEach(tokens => Console.WriteLine(string.Join(" | ", tokens)));

/* Output
dbo | Product Sales for 1997
Northwind | My Schema | My view or table function
My view or table function
dbo | My View 1
*/
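
The remaining step, deciding which token is the server, database, schema, or object, is not shown above. A minimal sketch of one way to do it follows, assuming the name parts are right-aligned (the last token is always the object, the one before it the schema, and so on). `ObjectNameMapper` and `Map` are hypothetical names introduced for illustration only. Names that omit a middle part, such as `Northwind..Products`, would need extra handling, because the tokenizer drops the empty part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ObjectNameMapper
{
    // Hypothetical helper: right-aligns 1 to 4 name parts into
    // (server, database, schema, object). Missing leading parts are null.
    public static (string Server, string Database, string Schema, string ObjectName)
        Map(IReadOnlyList<string> tokens)
    {
        if (tokens == null || tokens.Count < 1 || tokens.Count > 4)
            throw new ArgumentException("Expected 1 to 4 name parts.", nameof(tokens));

        // Pad on the left with nulls so there are always exactly four slots.
        string[] padded = Enumerable.Repeat<string>(null, 4 - tokens.Count)
                                    .Concat(tokens)
                                    .ToArray();

        return (padded[0], padded[1], padded[2], padded[3]);
    }
}

public static class Program
{
    public static void Main()
    {
        // One of the token lists printed by the tokenizer above.
        var name = ObjectNameMapper.Map(new[] { "dbo", "Product Sales for 1997" });

        string server = name.Server ?? "-";
        string database = name.Database ?? "-";
        string schema = name.Schema ?? "-";

        Console.WriteLine(string.Join(" | ", server, database, schema, name.ObjectName));
        // Prints: - | - | dbo | Product Sales for 1997
    }
}
```

Right-aligning keeps the four slots in the same order as the groups in the question (0 = server, 1 = database, 2 = schema, 3 = object), so a two-token result fills slots 2 and 3.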

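Answer 1 (score: 0)

To parse arbitrary SQL queries, you are better off using a SQL parser. Trying to parse arbitrary SQL with regular expressions amounts to writing your own parser.

With a full SQL parser, you can easily get what you need from a query such as: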

SELECT * FROM Northwind."My Schema"."My view or table function";

The output would look like this:

select clause:
Columns
Fullname:*
Prefix: Column:*    alias:

from clause:
   Northwind."My Schema"."My view or table function"

database: Northwind
schema:   "My Schema"
object:   "My view or table function"
object alias:

You can try this demo yourself to test more complex SQL queries.