Split a string without removing the delimiters, keeping each delimiter at the start of its piece?

Asked: 2016-11-29 07:09:45

Tags: c# string

This is my string.

string content =
  @"[INFO ] | 2016-11-28 10:56:19.68 | level to ""Info""
    [INFO ] | 2016-11-28 10:56:56.93 | to ""Info""
    [DEBUG ] | 2016-11-28 10:56:56.93 | been initialized successfully.
    [INFO ] | 2016-11-28 11:01:14.05 | to ""Info""
    [ERROR] | 2016-11-28 11:01:14.05 | initialized successfully.";

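This is my string content. I want to split it on the delimiters [INFO ], [ERROR ] and [DEBUG ], but I don't want the delimiters removed. I tried a regular expression with a positive lookbehind; it keeps the delimiters, but each one ends up appended to the end of the preceding piece. I want each delimiter to stay in its original position, at the start of its piece.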

I want the split strings to look like this:

      1=>[INFO ] | 2016-11-28 10:56:19.68 | level to "Info"
      2=>[INFO ] | 2016-11-28 10:56:56.93 | to "Info"
      3=>[DEBUG ] | 2016-11-28 10:56:56.93 | been initialized successfully.
      4=>[INFO ] | 2016-11-28 11:01:14.05 | to "Info"
      5=>[ERROR] | 2016-11-28 11:01:14.05 | initialized successfully.

2 answers:

Answer 0 (score: 1)

Instead of splitting, I suggest matching with a regular expression:

  string content = 
    @"[INFO ] | 2016-11-28 10:56:19.68 | level to ""Info""
      [INFO ] | 2016 - 11 - 28 10:56:56.93 | to ""Info""
      [DEBUG ] | 2016 - 11 - 28 10:56:56.93 | been initialized successfully.
      [INFO ] | 2016-11-28 11:01:14.05 | to ""Info""
      [ERROR] | 2016-11-28 11:01:14.05 | initialized successfully.";

  // Requires: using System.Linq; using System.Text.RegularExpressions;
  // square brackets []
  // with uppercase letters or spaces inside them,
  // followed by any characters
  // up to the end of the line or the end of the entire text
  string pattern = @"(\[[A-Z ]+\].+?)(?:\z|\n|\r)";

  var result = Regex
    .Matches(content, pattern, RegexOptions.Multiline)
    .OfType<Match>()
  // .Select(match => match.Groups[1].Value) // if you want just the match
    .Select((match, index) => $"{index + 1}=>{match.Groups[1].Value}");
  // .ToArray(); // <- you may want to materialize the result into, say, an array

Test:

  Console.Write(string.Join(Environment.NewLine, result));

Result:

 1=>[INFO ] | 2016-11-28 10:56:19.68 | level to "Info"
 2=>[INFO ] | 2016 - 11 - 28 10:56:56.93 | to "Info"
 3=>[DEBUG ] | 2016 - 11 - 28 10:56:56.93 | been initialized successfully.
 4=>[INFO ] | 2016-11-28 11:01:14.05 | to "Info"
 5=>[ERROR] | 2016-11-28 11:01:14.05 | initialized successfully.
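
If you would still rather split than match, a zero-width positive lookahead is one way to do it: the split happens in front of each tag and the tag is not consumed, so it stays at the start of its piece. The following is only a minimal sketch under some assumptions: the class name SplitKeepingDelimiters and the helper SplitEntries are hypothetical names made up for illustration, and the tag pattern assumes every tag is uppercase letters or spaces in square brackets, as in the sample above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitKeepingDelimiters
{
    // Hypothetical helper (not part of the answer above): splits the text
    // in front of every "[UPPERCASE ]" tag and keeps the tag in its piece.
    public static string[] SplitEntries(string content)
    {
        // (?=...) is a zero-width lookahead: it only checks that a tag follows,
        // so Regex.Split cuts before the tag without removing it.
        return Regex
            .Split(content, @"(?=\[[A-Z ]+\])")
            .Select(piece => piece.Trim())        // drop line breaks and indentation
            .Where(piece => piece.Length > 0)     // drop the empty piece before the first tag
            .ToArray();
    }

    public static void Main()
    {
        string content =
            @"[INFO ] | 2016-11-28 10:56:19.68 | level to ""Info""
              [DEBUG ] | 2016-11-28 10:56:56.93 | been initialized successfully.
              [ERROR] | 2016-11-28 11:01:14.05 | initialized successfully.";

        string[] entries = SplitEntries(content);
        for (int i = 0; i < entries.Length; i++)
        {
            Console.WriteLine($"{i + 1}=>{entries[i]}");
        }
    }
}
```

With the three-line sample in Main this yields three pieces, each beginning with its own tag. Note that a message which itself contains something like [ABC] would also be cut there, so a stricter list of tags may be safer for real logs.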

Answer 1 (score: 1)

I shamelessly started from @DmitryBychenko's answer and tried to improve on it.

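If you want to support multi-line entries and match the exact delimiters "[INFO ]", "[DEBUG ]" and "[ERROR ]" more precisely, you can use the following regular expression: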

var pattern = @"(\[INFO \]|\[DEBUG \]|\[ERROR \]).+?(?=\[INFO \]|\[DEBUG \]|\[ERROR \]|\z)";

var matches = System.Text.RegularExpressions.Regex.Matches(content, pattern, RegexOptions.Singleline)
    .OfType<Match>()
    .Select((match, index) => index + "=>" + match.Groups[0].Value.Trim());

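It matches one of the specified delimiters (the "(\[INFO \]|\[DEBUG \]|\[ERROR \])" part of pattern) and keeps matching until it reaches the next delimiter or the end of the text (the ".+?(?=\[INFO \]|\[DEBUG \]|\[ERROR \]|\z)" part).

This transforms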

@"[INFO ] | 2016-11-28 10:56:19.68 | level to ""Info""
[INFO ] | 2016-11-28 10:56:56.93 | to ""Info""
[DEBUG ] | 2016-11-28 10:56:56.93 | been initialized successfully.
[INFO ] | 2016-11-28 11:01:14.05 | to ""Info""
More info in second line
[IRRELEVANT TAG] | Noone knows what this is | ""Whatever""
[ERROR ] | 2016-11-28 11:01:14.05 | initialized successfully."