Preserving (and associating) separators when splitting a string in C#

Date: 2014-08-31 02:34:54

Tags: c# split

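When splitting a string, I would like to produce a sequence of token-separator pairs. So, with , and ; as my separators, I want " a , b;" to produce new string[][]{{" a ",","},{" b",";"},{"",""}}. The last entry indicates that the string ended with a separator. Naturally, two consecutive separators are separated by an empty token.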

1 answer:

Answer 0: (score: 1)

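Neither String.Split nor Regex.Split allows this kind of association; the result is always a flat sequence of strings. Even when the separators are captured into the sequence as well (Regex.Split does this when the pattern contains a capturing group), they end up interleaved with the tokens rather than paired with them.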
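To illustrate the interleaving, here is a minimal sketch that uses the question's sample input with a capturing group around the separator class. The variable name `parts` is only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// A capturing group in the pattern makes Regex.Split keep the separators,
// but they come back as ordinary elements of one flat array.
string[] parts = Regex.Split(" a , b;", "([,;])");

// Tokens sit at even indexes and separators at odd indexes; nothing in the
// result itself marks which is which.
Console.WriteLine(string.Join("|", parts));
// prints:  a |,| b|;|
```

The caller would have to pair the elements up by position, which is the bookkeeping the question is trying to avoid.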

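However, this task is easy to accomplish with Regex.Matches (or Match / NextMatch). The trick is to use the \G anchor (see Anchors in Regular Expressions) so that matching is incremental: each match must begin exactly where the previous one ended.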

using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = @" a , b;whatever";

// The \G anchor ensures the next match begins where the last ended.
// Then non-greedily (as in don't eat the separators) try to find a value.
// Finally match a separator.
var matches = Regex.Matches(input, @"\G(.*?)([,;])")
    .OfType<Match>();

// All the matches, deal with pairs as appropriate - here I simply group
// them into strings, but build a List of Pairs or whatnot.
var res = matches
    .Select(m => "{" + m.Groups[1].Value + "|" + m.Groups[2].Value + "}");
// res -> Enumerable with "{ a |,}", "{ b|;}" 

String trailing;
var lastMatch = matches.LastOrDefault();
if (lastMatch != null) {
    trailing = input.Substring(lastMatch.Index + lastMatch.Length);
    // If the separator was at the end, trailing is an empty string
} else {
    // No matches, the entire input is trailing.
    trailing = input;
}

// trailing -> "whatever"

Fill in the details as needed (and fix any issues). For tidiness, adapt this code as appropriate and put it in a method.
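As one possible way to wrap this up in a method, the sketch below uses the Match / NextMatch loop mentioned earlier instead of Regex.Matches. The names `SeparatorSplitter` and `SplitWithSeparators` are hypothetical, the separators are fixed to , and ; as in the question, and RegexOptions.Singleline is an addition (not in the code above) so that a token may contain line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SeparatorSplitter
{
    // Hypothetical helper: returns (token, separator) pairs, followed by one
    // final pair that holds the trailing text and an empty separator.
    public static List<(string Token, string Separator)> SplitWithSeparators(string input)
    {
        var pairs = new List<(string Token, string Separator)>();
        int consumed = 0; // index just past the last separator matched

        // \G forces each match to start where the previous one ended.
        // Singleline is an assumption added here so '.' also matches '\n'.
        for (Match m = Regex.Match(input, @"\G(.*?)([,;])", RegexOptions.Singleline);
             m.Success;
             m = m.NextMatch())
        {
            pairs.Add((m.Groups[1].Value, m.Groups[2].Value));
            consumed = m.Index + m.Length;
        }

        // Whatever follows the last separator ("" if the input ended with one).
        pairs.Add((input.Substring(consumed), ""));
        return pairs;
    }
}
```

Calling `SeparatorSplitter.SplitWithSeparators(" a , b;")` should give the three pairs from the question, with the final empty pair signalling that the input ended with a separator. For `"a,,b"`, the middle pair has an empty token.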