C# regex expression to retrieve a hierarchical string

Time: 2014-09-19 10:46:56

Tags: c# regex

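How can I use a regex in C# to parse the string below and return its contents in the match and group collections? The start and end markers are [[ and ]]. Can anyone help?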

[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]

I'm looking for output like the following.

item: Parent1
Children: [Child1, Child2, Child3]
item: Parent2
Children: [Child1, Child2]

4 Answers:

Answer 0 (score: 2)

You could try the regex below:

(?<=^|]\s)\[\[(\S+)|(\[\[(?!Parent).*?\]\])(?=]]\s|]]$)

Group index 1 contains the parent item and group index 2 contains its children.


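string input = @"[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]";

Regex rgx = new Regex(@"(?<=^|]\s)\[\[(?<item>\S+)|(?<children>\[\[(?!Parent).*?\]\])(?=]]\s|]]$)");

foreach (Match m in rgx.Matches(input))
{
    // Each match fills only one side of the alternation,
    // so print whichever group actually succeeded.
    if (m.Groups[1].Success)
        Console.WriteLine(m.Groups[1].Value); // parent item
    if (m.Groups[2].Success)
        Console.WriteLine(m.Groups[2].Value); // children, still wrapped in [[ ]]
}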

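The pattern above returns parents and children as separate matches, so the caller still has to pair them up. As a minimal sketch of one way to get the grouped output the question asks for, the example below relies on .NET keeping every capture of a repeated group in `Group.Captures`. The pattern, the class name `HierarchyCaptures` and the method `Parse` are illustrative assumptions, not part of the answer, and the sketch assumes a single level of nesting with no brackets inside the names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class HierarchyCaptures
{
    // One match per parent: [[name [[child]],[[child]]...]]
    // "item" is captured once; "child" is captured once per repetition.
    private static readonly Regex ParentPattern = new Regex(
        @"\[\[(?<item>[^\[\]]+?)\s*(?:\[\[(?<child>[^\[\]]+)\]\],?)*\]\]");

    public static List<KeyValuePair<string, List<string>>> Parse(string input)
    {
        var result = new List<KeyValuePair<string, List<string>>>();
        foreach (Match m in ParentPattern.Matches(input))
        {
            // Captures holds every value the repeated group matched,
            // not just the last one.
            List<string> children = m.Groups["child"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToList();
            result.Add(new KeyValuePair<string, List<string>>(
                m.Groups["item"].Value, children));
        }
        return result;
    }

    public static void Main()
    {
        string input = @"[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]";
        foreach (var pair in Parse(input))
        {
            Console.WriteLine("item: {0}", pair.Key);
            Console.WriteLine("Children: [{0}]", string.Join(", ", pair.Value));
        }
    }
}
```

Because the character classes exclude brackets, this version does not depend on the parent being literally named Parent, but it would need rework for deeper nesting.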

Answer 1 (score: 0)

((\[\[Parent\d\]\])(\[\[Child \d\]\])+\]\])+

How about this?

Not actually tested.
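Since the pattern was posted untested, it is worth running it against the sample string. The sketch below suggests that it finds nothing there, because it expects `]]` directly after the parent name and leaves no room for the space or the commas between children. The `Adjusted` pattern is only a guess at what was intended, and the names `PatternCheck` and `CountMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    public const string Input =
        @"[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]";

    // The pattern exactly as posted in the answer.
    public const string Posted = @"((\[\[Parent\d\]\])(\[\[Child \d\]\])+\]\])+";

    // Assumed intent: allow whitespace after the parent name and
    // an optional comma after each child.
    public const string Adjusted = @"\[\[Parent\d\s(?:\[\[Child \d\]\],?)+\]\]";

    public static int CountMatches(string pattern, string input)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountMatches(Posted, Input));   // prints 0
        Console.WriteLine(CountMatches(Adjusted, Input)); // prints 2
    }
}
```

Both patterns still hard-code the words Parent and Child and a single digit, so they only fit input shaped exactly like the sample.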

Answer 2 (score: 0)

(?'parent'Parent\d )|(?!^)\G(?:\[\[(?'child'.*?)]]),?

The group 'parent' holds all the parent elements and the group 'child' holds all the child elements.

using System;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        string input = @"[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]";
        Regex rgx = new Regex(@"(?<parent>Parent\d )|(?!^)\G(?:\[\[(?<child>.*?)]]),?");
        foreach (Match m in rgx.Matches(input))
        {
            // Each match is either a parent or a single child, never both.
            if (m.Groups["parent"].Success)
                Console.WriteLine(m.Groups["parent"].Value.Trim());
            if (m.Groups["child"].Success)
                Console.WriteLine(m.Groups["child"].Value);
        }
    }
}

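With this pattern every child match is anchored by `\G` to the end of the previous match, so a child always follows its parent or a sibling. A minimal sketch of using that ordering to rebuild the hierarchy is shown below. It reuses the answer's regex unchanged, while the class name `ContiguousMatchGrouping` and the method `GroupByParent` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContiguousMatchGrouping
{
    // Same pattern as in the answer: a parent name, or a child that
    // starts exactly where the previous match ended (\G).
    private static readonly Regex Token = new Regex(
        @"(?<parent>Parent\d )|(?!^)\G(?:\[\[(?<child>.*?)]]),?");

    public static List<KeyValuePair<string, List<string>>> GroupByParent(string input)
    {
        var result = new List<KeyValuePair<string, List<string>>>();
        foreach (Match m in Token.Matches(input))
        {
            if (m.Groups["parent"].Success)
            {
                // Start a new entry; Trim drops the space the pattern captures.
                result.Add(new KeyValuePair<string, List<string>>(
                    m.Groups["parent"].Value.Trim(), new List<string>()));
            }
            else if (result.Count > 0)
            {
                // Attach the child to the most recently seen parent.
                result[result.Count - 1].Value.Add(m.Groups["child"].Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        string input = @"[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]";
        foreach (var pair in GroupByParent(input))
        {
            Console.WriteLine("item: {0}", pair.Key);
            Console.WriteLine("Children: [{0}]", string.Join(", ", pair.Value));
        }
    }
}
```

The pattern still requires parent names of the form Parent followed by one digit, so other names would need a different first alternative.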

Answer 3 (score: 0)

How about converting it into something easier to work with, namely JSON:

// Requires: using System; using System.Linq;
string ConvertToJson(string input)
{
    var elements = input
        // replace the [[ and ]] markers with double quotes
        .Replace("[[", "\"").Replace("]]", "\"")
        // collapse the doubled quotes left where two closing markers were adjacent
        .Replace("\"\"", "\"")
        // split on every space-quote combination
        .Split(new[] { " \"" }, StringSplitOptions.RemoveEmptyEntries)
        // make sure every element starts and ends with a quote
        .Select(x => "\"" + x.Trim('"') + "\"")
        // even indexes (0, 2, ...) hold a parent item, odd indexes hold its children
        .Select((x, i) => (i % 2 == 0)
            ? ("{\"item\":" + x)
            : ",\"children\":[" + x + "]},");

    // join back into one string, drop the trailing comma and wrap in an array
    return "[" + String.Concat(elements).Trim(',') + "]";
}

Input:

[[Parent1 [[Child 1]],[[Child 2]],[[Child 3]]]] [[Parent2 [[Child 1]],[[Child 2]]]]

Output:

[{"item":"Parent1","children":["Child 1","Child 2","Child 3"]},{"item":"Parent2","children":["Child 1","Child 2"]}]

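You can then consume it with JSON.NET or anything else you like.

You'll also notice that this solution doesn't require the parents to be named Parent, as the other solutions do. As a bonus, there isn't a regex in sight...

For completeness, here is an example that deserializes it with JSON.NET: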

var list = JsonConvert.DeserializeObject<dynamic>(jsonString);

foreach (var item in list)
{
    Console.WriteLine("item: {0}", item.item);
    Console.WriteLine("Children: [{0}]", String.Join(", ", item.children));
}

Output:

item: Parent1
Children: [Child 1, Child 2, Child 3]
item: Parent2
Children: [Child 1, Child 2]