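To spare myself a bad case of the shakes (somewhere between the jitters and carpal tunnel syndrome), I need to find a way to automatically parse a large file of SQL statements and their parameter values.
I have a file containing SQL statements in this format: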
select Animal#, RacketThreshold, PeakOil as Oil
from OilAnimalPlatypus2
where OilAnimalPlatypusID = :ID
and Animal# = :Animal
and TelecasterAccessType = 'D'
UNION
select Animal, RacketThreshold, PeakOil as Oil
from OilRequestPlatypus
where PlatypusID = :ID
and Animal = :Animal
order by RacketThreshold
-->ID(VARCHAR[0])=<NULL>
:Animal(INTEGER)=2
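...that is, a multi-line SQL statement followed by a blank line, followed by two dashes and an arrow with the parameter name, data type, and value, followed by more of the same ad infinitum (except where the SQL statement has no parameters).
From this great gob of goo I want to create a separate string for each unique query (many of them are identical, although they are usually given different parameter values). If possible, I would also like to track all the parameter values passed to a particular query. For example, if a particular parameter is passed "1" the first time the query is called, "42" the next time, and "3.14" the time after that, I want the collection for that arg name to be 1, 42, 3.14.
There are more than 400 queries, and I would rather not do it all "by hand", especially comparing the queries to find matches.
OK, using Jon's code, after adding this code: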
private void buttonOpenAndParseSQLMonFile_Click(object sender, EventArgs e)
{
    var queriesAndArgs = (Dictionary<string, List<string>>)ParseFile("SQLMonTraceLog.txt");
    foreach (var pair in queriesAndArgs)
    {
        richTextBoxParsedResults.AppendText(pair.Key);
        richTextBoxParsedResults.AppendText(Environment.NewLine);
        foreach (String s in pair.Value)
        {
            richTextBoxParsedResults.AppendText(s);
            richTextBoxParsedResults.AppendText(Environment.NewLine);
        }
        richTextBoxParsedResults.AppendText(Environment.NewLine);
    }
}
...I get these kinds of results in my richTextBox:
select ABCID from ABCWorker where lower(loginid) = lower(user)
select r.roleid from abcrole r, abcworker w where lower(w.loginid)=lower(user) and r.abcid=w.abcid and r.status='A'
select Tier#, BenGrimm, PeakRate as Ratefrom RageAnimalGreenBayPackers2 where RageAnimalGreenBayPackersID = :ID and Tier# =
:Tier and FlyingVAccessType = 'D' UNION select Tier, BenGrimm, PeakRate as Rate from CaliforniaCondorGreenBayPackers where
GreenBayPackersID = :ID and Tier = :Tier order by BenGrimm
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=1
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=1
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=1
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=4
select Tier#, BenGrimm, PeakRate as Rate from RageAnimalGreenBayPackers2 where RageAnimalGreenBayPackersID = :ID and Tier# =
:Tier and FlyingVAccessType = 'D' UNION select Tier, BenGrimm, PeakRate as Rate from CaliforniaCondorGreenBayPackers where
GreenBayPackersID = :ID and Tier = :Tier order by BenGrimm
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=2
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=5
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=1
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=2
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=3
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=4
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=2
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=3
--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=4
(etc.)
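...So this is very enlightening, but I can see that it is not quite what I need, and it also depends on my lame hand-tweaking of the file. So I think I need to step back and parse the file as it was actually given to me, where each "interesting" event has an incrementing number: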
. . .
6 11:30:46 SQL Execute: select ABCID
from ABCWorker
where lower(loginid) = lower(user)
7 11:30:46 SQL Prepare: select r.roleid from abcrole r, abcworker w where lower(w.loginid)=lower(user) and
r.abcid=w.abcid and r.status='A'
8 11:30:46 SQL Execute: select r.roleid from abcrole r, abcworker w where lower(w.loginid)=lower(user) and
r.abcid=w.abcid and r.status='A'
9 11:30:46 SQL Execute: select Tier#, BenGrimm, PeakRate as Rate
from RageAnimalGreenBayPackers2
where RageAnimalGreenBayPackersID = :ID
and Tier# = :Tier
and FlyingVAccessType = 'D'
UNION
select Tier, BenGrimm, PeakRate as Rate
from CaliforniaCondorGreenBayPackers
where GreenBayPackersID = :ID
and Tier = :Tier
order by BenGrimm
10 11:30:46 :ID(VARCHAR[0])=<NULL>
:Tier(INTEGER)=1
11 11:30:46 SQL Execute: select Tier#, BenGrimm, PeakRate as Rate
from RageAnimalGreenBayPackers2
where RageAnimalGreenBayPackersID = :ID
and Tier# = :Tier
and FlyingVAccessType = 'D'
UNION
select Tier, BenGrimm, PeakRate as Rate
from CaliforniaCondorGreenBayPackers
where GreenBayPackersID = :ID
and Tier = :Tier
order by BenGrimm
12 11:30:46 :ID(VARCHAR[0])=<NULL>
:Tier(INTEGER)=2
. . .
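Answer 0 (score: 1)
What you really need is a lexer. Take a look at ANTLR - http://www.antlr.org/
You need to define a "grammar", that is, the characteristics of each element of the language (in this case, your SQL file). ANTLR then processes your file and emits results according to your grammar definition.
It is just a tokenizing and parsing process.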
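Short of writing a full ANTLR grammar, the tokenizing half of that process can be sketched by hand. The following is not ANTLR and not generated code; it is a minimal line-level lexer that assumes every event in the numbered trace starts with "<number> <hh:mm:ss>" as in the question's sample. The names `TraceToken`, `TraceTokenKind`, and `TraceLexer` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public enum TraceTokenKind { SqlStatement, Parameters, Continuation, Other }

public sealed class TraceToken
{
    public TraceTokenKind Kind { get; set; }
    public string Text { get; set; }
}

public static class TraceLexer
{
    // Assumed event header: "<number> <hh:mm:ss> <rest of line>"
    private static readonly Regex Header =
        new Regex(@"^\s*\d+\s+\d{2}:\d{2}:\d{2}\s+(?<rest>.*)$");

    // Assumed statement prefixes, taken from the sample in the question
    private static readonly Regex SqlPrefix =
        new Regex(@"^SQL (Execute|Prepare):\s*(?<sql>.*)$");

    public static IEnumerable<TraceToken> Tokenize(IEnumerable<string> lines)
    {
        foreach (string raw in lines)
        {
            if (string.IsNullOrWhiteSpace(raw)) continue;

            Match header = Header.Match(raw);
            if (!header.Success)
            {
                // No number/time prefix: this line continues the previous event
                yield return new TraceToken { Kind = TraceTokenKind.Continuation, Text = raw.Trim() };
                continue;
            }

            string rest = header.Groups["rest"].Value.Trim();
            Match sql = SqlPrefix.Match(rest);
            if (sql.Success)
            {
                yield return new TraceToken { Kind = TraceTokenKind.SqlStatement, Text = sql.Groups["sql"].Value.Trim() };
            }
            else if (rest.StartsWith(":", StringComparison.Ordinal))
            {
                // Parameter events begin with ":Name(TYPE)=value"
                yield return new TraceToken { Kind = TraceTokenKind.Parameters, Text = rest };
            }
            else
            {
                yield return new TraceToken { Kind = TraceTokenKind.Other, Text = rest };
            }
        }
    }
}
```

A parsing step would then attach each `Continuation` token to the `SqlStatement` or `Parameters` token before it. Calling `TraceLexer.Tokenize(File.ReadLines(path))` would stream the tokens without loading the whole file.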
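Answer 1 (score: 1)
Here is a concrete example of what I described in my comment. You can do this by using a StreamReader and collecting each block into a List; for example: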
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

List<String> statementBlocks = new List<String>();
StringBuilder blockCollector = new StringBuilder();

// Read the file a line at a time; "using" closes the reader even if an exception is thrown
using (StreamReader file = new StreamReader("C:\\temp\\annoying_text_file.sql"))
{
    string line;
    while ((line = file.ReadLine()) != null)
    {
        // If the line has content, then we append it to our string builder
        if (!String.IsNullOrWhiteSpace(line)) // String.IsNullOrWhiteSpace is new in .NET 4 and also matches whitespace-only lines
        {
            blockCollector.AppendLine(line);
        }
        else if (blockCollector.Length > 0)
        {
            // We've hit a blank line: dump the block to the list and reinitialize the StringBuilder
            statementBlocks.Add(blockCollector.ToString());
            blockCollector = new StringBuilder();
        }
    }
}

// Flush the final block in case the file does not end with a blank line
if (blockCollector.Length > 0)
{
    statementBlocks.Add(blockCollector.ToString());
}

foreach (string statementBlock in statementBlocks)
{
    if (!String.IsNullOrEmpty(statementBlock))
    {
        if (statementBlock.StartsWith("-->"))
        {
            // Code to split out the arguments; if they are delimited with : then you can just split this block
            // string[] paramsAndValues = statementBlock.Replace("-->", String.Empty).Split(':');
            // then each string in here is paramName(DataType)=Value, which is also splittable.
        }
        else
        {
            // Do whatever you want with this valid block (including writing it to another file!)
            // To keep only the unique ones, store each block in a list and check whether the block already exists in the list each time; if it does, just skip it.
            // Given you also know that the next block will be a parameter block, you can collect the parameters here too.
        }
    }
}
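I can't check that this compiles right now, but it should give you a general idea of a possible approach.
It assumes that the only blank lines are the ones between statement blocks.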
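The argument-splitting step left as comments above could look something like the following. This is a minimal sketch that assumes each parameter has the shape `:Name(DataType)=Value` and that no value contains a colon (a time value, for instance, would break the split). `TraceParameter` and `ParameterLineParser` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class TraceParameter
{
    public string Name { get; set; }
    public string DataType { get; set; }
    public string Value { get; set; }
}

public static class ParameterLineParser
{
    // Splits a block such as "--> :ID(VARCHAR[0])=<NULL> :Tier(INTEGER)=1"
    // into one TraceParameter per ":Name(DataType)=Value" entry.
    public static List<TraceParameter> Parse(string block)
    {
        var result = new List<TraceParameter>();
        string body = block.Replace("-->", string.Empty);

        foreach (string piece in body.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string item = piece.Trim();
            int open = item.IndexOf('(');
            int close = item.IndexOf(")=", StringComparison.Ordinal);

            // Skip anything that is not in the assumed Name(DataType)=Value shape
            if (open <= 0 || close < open) continue;

            result.Add(new TraceParameter
            {
                Name = item.Substring(0, open),
                DataType = item.Substring(open + 1, close - open - 1),
                Value = item.Substring(close + 2)
            });
        }
        return result;
    }
}
```

Searching for `")="` rather than the first `)` keeps a data type like `VARCHAR[0]` intact, and would also cope with a type that contains its own parentheses.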
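Answer 2 (score: 1)
Assuming you are separating the queries from one another with another blank line, you could try something like the following to parse the file. The code reads the file through to the end. Each call to parseQuery reads lines until it finds a blank line and appends them together as the query. It then checks the next line: if that is not the start of an args block, it saves the query with no arguments and starts over, assuming it is at the beginning of another query. If the line is the start of an args block, the code reads until it reaches another blank line, saves the query together with its arguments, and returns. The while (parseQuery) loop ensures the whole file gets parsed.
In the end, the code spits out a dictionary with the query string as the key and a list of strings holding the different argument sets that were supplied. Error checking is omitted for simplicity. In a real-world scenario you would want to add handling for things like the file not existing.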
static IDictionary<string, List<string>> ParseFile(string path)
{
    Dictionary<string, List<string>> queries = new Dictionary<string, List<string>>();
    using (var reader = File.OpenText(path))
    {
        while (parseQuery(reader, queries)) { }
    }
    return queries;
}
private static bool parseQuery(StreamReader reader, Dictionary<string, List<string>> queries)
{
    StringBuilder sbQuery = new StringBuilder();
    StringBuilder sbArgs = new StringBuilder();
    // Read in query
    bool moreLines = ParseBlock(reader, sbQuery);
    if (moreLines)
    {
        while (moreLines)
        {
            string line = reader.ReadLine();
            // Check for the beginning of an args block.
            if (line != null && line.StartsWith("-->"))
            {
                // Read in args; the trailing space keeps this line from
                // running into the lines that ParseBlock appends
                sbArgs.Append(line.Trim() + " ");
                moreLines = ParseBlock(reader, sbArgs);
                break;
            }
            // If this is not an args block, it is a new query
            // Save the last query and start over
            else
            {
                AddQuery(queries, sbQuery.ToString(), sbArgs.ToString());
                sbQuery = new StringBuilder();
                // Make sure we capture the line just read, trimmed and space-terminated
                // the same way ParseBlock does; otherwise the first two lines fuse
                // ("... as Ratefrom ...") and identical queries end up under different keys
                if (!string.IsNullOrWhiteSpace(line))
                {
                    sbQuery.Append(line.Trim() + " ");
                }
                moreLines = ParseBlock(reader, sbQuery);
            }
        }
    }
    AddQuery(queries, sbQuery.ToString(), sbArgs.ToString());
    return moreLines;
}
private static bool ParseBlock(StreamReader reader, StringBuilder builder)
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        line = line.Trim();
        if (string.IsNullOrWhiteSpace(line)) break;
        builder.Append(line + " ");
    }
    return line != null;
}
private static void AddQuery(Dictionary<string, List<string>> queries, string query, string args)
{
    if (query.Length > 0)
    {
        List<string> lstParams;
        if (!queries.TryGetValue(query, out lstParams))
        {
            lstParams = new List<string>();
        }
        lstParams.Add(args);
        queries[query] = lstParams;
    }
}
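To get from this dictionary to the per-parameter value lists the question asks for (for example 1, 42, 3.14 for one arg name), the args strings can be broken down further. The following is a minimal sketch rather than part of the code above: `ParameterHistory` and `Collect` are hypothetical names, and the regular expression assumes each parameter looks like `:Name(TYPE)=value` with no whitespace inside the value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParameterHistory
{
    // Assumed parameter shape: ":Name(TYPE)=value", value running up to the next whitespace
    private static readonly Regex Param =
        new Regex(@":(?<name>\w+)\((?<type>[^)]*)\)=(?<value>\S*)");

    private static readonly Regex Whitespace = new Regex(@"\s+");

    // Input: query text -> list of raw args strings (the shape ParseFile returns).
    // Output: query text -> parameter name -> distinct values in first-seen order.
    public static Dictionary<string, Dictionary<string, List<string>>> Collect(
        IDictionary<string, List<string>> queriesAndArgs)
    {
        var result = new Dictionary<string, Dictionary<string, List<string>>>();

        foreach (var pair in queriesAndArgs)
        {
            // Collapse runs of whitespace so layout differences do not create new keys
            string key = Whitespace.Replace(pair.Key, " ").Trim();

            Dictionary<string, List<string>> perParam;
            if (!result.TryGetValue(key, out perParam))
            {
                perParam = new Dictionary<string, List<string>>();
                result[key] = perParam;
            }

            foreach (string args in pair.Value)
            {
                foreach (Match m in Param.Matches(args))
                {
                    string name = m.Groups["name"].Value;
                    string value = m.Groups["value"].Value;

                    List<string> values;
                    if (!perParam.TryGetValue(name, out values))
                    {
                        values = new List<string>();
                        perParam[name] = values;
                    }
                    if (!values.Contains(value)) values.Add(value);
                }
            }
        }
        return result;
    }
}
```

Usage would be along the lines of `var history = ParameterHistory.Collect(ParseFile("SQLMonTraceLog.txt"));`, after which `history[query]["Tier"]` holds the distinct values seen for `:Tier`. A `List` with a `Contains` check is used instead of a `HashSet` to preserve call order; with a few hundred queries the linear lookup should not matter.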