Split a string on semicolons, ignoring semicolons inside parentheses

Time: 2012-09-28 14:16:29

Tags: c# regex parsing

I need to split a string using the semicolon (;) as the delimiter. Semicolons inside parentheses should be ignored.

Example:

string inputString = "(Apple;Mango);(Tiger;Horse);Ant;Frog;";

The resulting list of strings should be:

(Apple;Mango)
(Tiger;Horse)
Ant
Frog

Other valid input strings are:

string inputString = "The fruits are (mango;apple), and they are good";

The string above should come back as a single string:

"The fruits are (mango;apple), and they are good"

string inputString = "The animals in (African (Lion;Elephant) and Asian(Panda; Tiger)) are endangered species; Some plants are endangered too.";

The string above should be split into two strings, as follows:

"The animals in (African (Lion;Elephant) and Asian(Panda; Tiger)) are endangered species"
"Some plants are endangered too."

I have searched a lot but could not find an answer for this scenario.

Does anyone know how to achieve this without reinventing the wheel?

2 answers:

Answer 0: (score: 1)

Use a regular expression that matches what you want to keep, rather than the delimiter:

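using System;
using System.Text.RegularExpressions;

string inputString = "(Apple;Mango);(Tiger;Horse);Ant;Frog;";

// Match the items to keep instead of splitting on the delimiter.
MatchCollection m = Regex.Matches(inputString, @"\([^;)]*(;[^;)]*)*\)|[^;]+");

foreach (Match x in m)
{
    Console.WriteLine(x.Value);
}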

Output:

(Apple;Mango)
(Tiger;Horse)
Ant
Frog

Explanation of the expression:

\(           opening parenthesis
[^;)]*       characters before semicolon
(;[^;)]*)*   zero or more semicolons, each with the characters after it
\)           closing parenthesis
|            or
[^;]+        text with no semicolon

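Note: the expression above also accepts parenthesized values without a semicolon, e.g. (Lark), and with several semicolons, e.g. (Lark;Pine;Birch). It also skips empty values, so ";;Pine;;;;Birch;;;" yields two items rather than ten.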
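The expression above covers the first example, but not the other two inputs from the question: text before a parenthesis is cut at the first semicolon, and nested parentheses are not matched. One possible extension, keeping the same "match what you want to keep" idea, is a sketch based on .NET balancing groups. The class name BalancedRegexSplitter, the method Split, and the group name depth are made up for illustration, and trimming each item is an assumption based on the expected output in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer above.
public static class BalancedRegexSplitter
{
    // One item is a run of plain characters and/or balanced parenthesized groups.
    private static readonly Regex ItemPattern = new Regex(
        @"(?:
            [^;()]                  # any character except a semicolon or a parenthesis
          |
            \(                      # opening parenthesis of a group
              (?>
                  [^()]+            # anything except parentheses, semicolons included
                | \( (?<depth>)     # nested opening parenthesis: push onto the depth stack
                | \) (?<-depth>)    # nested closing parenthesis: pop from the depth stack
              )*
              (?(depth)(?!))        # fail if a nested parenthesis is still open
            \)                      # closing parenthesis of the group
          )+",
        RegexOptions.IgnorePatternWhitespace);

    public static List<string> Split(string input)
    {
        var items = new List<string>();
        foreach (Match m in ItemPattern.Matches(input))
        {
            // Assumption: surrounding whitespace is not wanted in the result.
            string item = m.Value.Trim();
            if (item.Length > 0)
            {
                items.Add(item);
            }
        }
        return items;
    }
}
```

Calling BalancedRegexSplitter.Split on the three inputs from the question should give four, one, and two items respectively. Like the original expression, it skips empty values. Input with unbalanced parentheses is not handled in any special way.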

Answer 1: (score: 0)

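Handle the parenthesized case separately from the "normal" case, so that semicolons inside parentheses are not treated as delimiters.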

A regular expression that does this (matching a single element of the input) might look like this (untested):

"\([A-Za-z;]+\)|[A-Za-z]+"
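The expression above only matches letters, so it does not cover inputs with spaces or nested parentheses. If a regular expression is not a requirement, the same idea of treating the parenthesized case separately can be sketched as a single pass that tracks the parenthesis depth and splits only at depth zero. The names SemicolonSplitter and SplitTopLevel are hypothetical, and trimming items and skipping empty ones are assumptions taken from the expected output in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the answer above.
public static class SemicolonSplitter
{
    // Splits on ';' only where no parenthesis is currently open.
    public static List<string> SplitTopLevel(string input)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        int depth = 0; // number of currently open parentheses

        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++;
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
            }

            if (c == ';' && depth == 0)
            {
                // Top-level delimiter: finish the current item.
                AddIfNotEmpty(items, current);
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last item has no trailing delimiter.
        AddIfNotEmpty(items, current);
        return items;
    }

    // Assumption: items are trimmed and empty items are dropped.
    private static void AddIfNotEmpty(List<string> items, StringBuilder current)
    {
        string item = current.ToString().Trim();
        if (item.Length > 0)
        {
            items.Add(item);
        }
    }
}
```

For the nested example from the question, SemicolonSplitter.SplitTopLevel should return the two expected strings. If an opening parenthesis is never closed, everything after it stays in one item, which may or may not be the desired behavior.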