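I have used Regex to convert a file containing a hierarchy into a specified format, but I feel there should be a better way, because I have to work out the parent nodes manually. Regular expressions seemed the natural choice because the file has some complexities (which I removed from this example) that Regex handles well. I could be convinced otherwise, though.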
Here is the problem. The hierarchy is represented by space indentation, one space per level. Example:
TopLevel
 Next Level
  Leaf 1:24
  Leaf 2:62
 Another 2nd Level
  Leaf 3:1
  Leaf 4:4788
Top Level 2
 Lower Level
  Leaf 5:28298
 Last Level 2
  Leaf 6:9871
  Leaf 7:3
This needs to be converted efficiently into a dictionary. Here is the result the program below produces:
TopLevel.Next Level.Leaf 1=24
TopLevel.Next Level.Leaf 2=62
TopLevel.Another 2nd Level.Leaf 3=1
TopLevel.Another 2nd Level.Leaf 4=4788
Top Level 2.Lower Level.Leaf 5=28298
Top Level 2.Last Level 2.Leaf 6=9871
Top Level 2.Last Level 2.Leaf 7=3
My solution is below. Having to search the capture groups to find each leaf's parent nodes feels wrong.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

namespace ConsoleApplicationTestHierarchyTextToDictionary
{
    class Program
    {
        // Leading spaces are significant: one per hierarchy level.
        private const string TestFileContents =
@"TopLevel
 Next Level
  Leaf 1:24
  Leaf 2:62
 Another 2nd Level
  Leaf 3:1
  Leaf 4:4788
Top Level 2
 Lower Level
  Leaf 5:28298
 Last Level 2
  Leaf 6:9871
  Leaf 7:3
";

        // Pattern templates; the placeholder names are expanded into Expression below.
        private const string ContentLevel1 = "(?<Level1Group>ContentLevel1Header(ContentLevel2)+)+";
        private const string ContentLevel2 = "(?<Level2Group>ContentLevel2Header(ContentDetail)+)";
        private const string ContentLevel1Header = "^(?<Level1HeaderName>IdentifierName)\\s*$\\n";
        private const string ContentLevel2Header = "^\\s(?<Level2HeaderName>IdentifierName)\\s*$\\n";
        private const string ContentDetail = "^\\s{2}(?<DetailName>IdentifierName)\\s*:\\s*(?<DetailValue>\\d*)\\s*$\\n";
        private const string IdentifierName = "(\\w([\\s\\t\\w]*\\w)?)";

        private static readonly string Expression =
            ContentLevel1
                .Replace("(ContentLevel1)", ContentLevel1)
                .Replace("ContentLevel1Header", ContentLevel1Header)
                .Replace("(ContentLevel2)", ContentLevel2)
                .Replace("ContentLevel2Header", ContentLevel2Header)
                .Replace("ContentDetail", ContentDetail)
                .Replace("IdentifierName", IdentifierName);

        private static readonly Regex regex = new Regex(Expression, RegexOptions.Compiled | RegexOptions.Multiline);

        static void Main(string[] args)
        {
            var result = new Dictionary<string, int>();
            Match match = regex.Match(TestFileContents);

            // DetailName and DetailValue captures are parallel: index i refers to the same leaf.
            for (int i = 0; i < match.Groups["DetailName"].Captures.Count; i++)
            {
                Capture detailNameCapture = match.Groups["DetailName"].Captures[i];
                string detailName = detailNameCapture.Value;
                string detailValue = match.Groups["DetailValue"].Captures[i].Value;

                // This feels wrong: locate the enclosing group captures by position to find the parents.
                Capture level2Group = match.Groups["Level2Group"].Captures.Cast<Capture>().FirstOrDefault(c => c.Contains(detailNameCapture));
                Capture level2Header = match.Groups["Level2HeaderName"].Captures.Cast<Capture>().FirstOrDefault(c => level2Group.Contains(c));
                Capture level1Group = match.Groups["Level1Group"].Captures.Cast<Capture>().FirstOrDefault(c => c.Contains(detailNameCapture));
                Capture level1Header = match.Groups["Level1HeaderName"].Captures.Cast<Capture>().FirstOrDefault(c => level1Group.Contains(c));

                string keyName = String.Format("{0}.{1}.{2}", level1Header, level2Header, detailName);
                result[keyName] = Int32.Parse(detailValue);
            }

            // Print the dictionary as key=value lines.
            foreach (KeyValuePair<string, int> entry in result)
            {
                Console.WriteLine("{0}={1}", entry.Key, entry.Value);
            }

            Console.ReadKey();
        }
    }

    static class CaptureHelper
    {
        // True when 'test' lies entirely within the character span of 'source'.
        public static bool Contains(this Capture source, Capture test)
        {
            return source.Index <= test.Index && (source.Index + source.Length) >= (test.Index + test.Length);
        }
    }
}
Is there a cleaner way to achieve this?
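For comparison, one direction that avoids the capture search is to read the file line by line and keep the current chain of headers in a list indexed by indentation depth. The following is only a minimal sketch under assumptions not guaranteed by the real file: exactly one leading space per level, no skipped levels, header lines contain no colon, and leaves have the form name:value with an integer value. IndentParser and Parse are hypothetical names, and the complexities removed from the example are not handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original program.
static class IndentParser
{
    // Assumes one leading space per level and "name:value" leaf lines.
    public static Dictionary<string, int> Parse(string text)
    {
        var result = new Dictionary<string, int>();
        var path = new List<string>(); // path[d] is the most recent header at depth d

        foreach (string rawLine in text.Split('\n'))
        {
            string line = rawLine.TrimEnd(); // also drops a trailing '\r'
            if (line.Length == 0) continue;

            // Depth is the number of leading spaces.
            int depth = line.Length - line.TrimStart(' ').Length;
            string content = line.Substring(depth);

            // Headers at this depth or deeper are no longer ancestors.
            if (depth < path.Count) path.RemoveRange(depth, path.Count - depth);

            int colon = content.IndexOf(':');
            if (colon < 0)
            {
                path.Add(content); // header line becomes the parent for deeper lines
            }
            else
            {
                string name = content.Substring(0, colon).Trim();
                int value = int.Parse(content.Substring(colon + 1).Trim());
                var parts = new List<string>(path) { name };
                result[string.Join(".", parts)] = value;
            }
        }
        return result;
    }
}

static class IndentParserDemo
{
    static void Main()
    {
        var parsed = IndentParser.Parse("TopLevel\n Next Level\n  Leaf 1:24\n");
        Console.WriteLine(parsed["TopLevel.Next Level.Leaf 1"]); // prints 24
    }
}
```

Because the parent of a line is simply whatever sits in the list just above its depth, no positional lookup is needed, and the same loop would work for more than three levels. Whether it can absorb the details that made Regex attractive in the real file is an open question.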