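I've done this before with a balancing regular expression, back when I only had one balanced character... but with more balanced characters it gets a lot more complicated and ugly.
For my current purposes I switched to writing a method that tokenizes the string, but it is very slow (and extremely inefficient). The most expensive part seems to be the gratuitous use of Substring (yes, I know it's bad).
Basically, I want to take the following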
hello("(abc d)", efg (hijk)) and,some more<%lmn, "o(\")pq", (xy(z))%>
and end up with
hello("(abc d)", efg (hijk))
[space] (the actual character)
and
,
some more
<%lmn, "o()pq", (xy(z))%>
In other words, I am splitting on the following (and I want these included in the resulting array):
[space]
,
...and I have "balanced grouping strings":
" "
( )
<% %>
...and I have an escape character:
\
I don't want to write a full-blown parser for this...
Here is the code:
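using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class StringSplitExtensions
{
    public static IEnumerable<string> SplitNotEnclosed(
        this string s,
        IEnumerable<string> separators,
        Dictionary<string, string> enclosingValues = null,
        IEnumerable<char> escapeCharacters = null,
        bool includeSeparators = false,
        StringComparison comparisonType = StringComparison.Ordinal)
    {
        var results = new List<string>();
        var enclosureStack = new Stack<KeyValuePair<string, string>>();
        bool atEscapedCharacter = false;
        if (escapeCharacters == null) escapeCharacters = new[] { '\\' };
        // Default enclosure: a double quote that is closed by another double quote.
        if (enclosingValues == null) enclosingValues = new[] { "\"" }.ToDictionary(i => i);
        // Try longer tokens first so that a longer token wins over a shorter prefix of it.
        var orderedEnclosingValues = enclosingValues.OrderByDescending(i => i.Key.Length).ToArray();
        separators = separators.OrderByDescending(v => v.Length).ToArray();
        var currentPart = new StringBuilder();
        while (s.Length > 0)
        {
            int addToIndex = 0;
            var newEnclosingValue = orderedEnclosingValues.FirstOrDefault(v => s.StartsWith(v.Key, comparisonType));
            // Use comparisonType here as well, so closing tokens are matched the same way as opening ones.
            if (enclosureStack.Count > 0 && !atEscapedCharacter && s.StartsWith(enclosureStack.Peek().Value, comparisonType))
            {
                // Closing token of the innermost open enclosure.
                addToIndex = enclosureStack.Peek().Value.Length;
                enclosureStack.Pop();
            }
            else if (newEnclosingValue.Key != null && !atEscapedCharacter)
            {
                // Opening token of a new (possibly nested) enclosure.
                enclosureStack.Push(newEnclosingValue);
                addToIndex = newEnclosingValue.Key.Length;
            }
            else if (escapeCharacters.Contains(s[0]) && enclosureStack.Count > 0)
            {
                // Escape character inside an enclosure; toggling handles an escaped escape character.
                atEscapedCharacter = !atEscapedCharacter;
                addToIndex = 1;
            }
            else if (enclosureStack.Count > 0)
            {
                // Ordinary (or escaped) character inside an enclosure.
                atEscapedCharacter = false;
                addToIndex = 1;
            }
            // Only look for separators when nothing was consumed above; otherwise a
            // multi-character closing token such as "%>" would be cut back to one character.
            if (enclosureStack.Count == 0 && addToIndex == 0)
            {
                string separator = separators.FirstOrDefault(v => s.StartsWith(v, comparisonType));
                if (separator != null)
                {
                    if (currentPart.Length > 0) results.Add(currentPart.ToString());
                    // Add separators only on request. Filtering afterwards with Except()
                    // is a set operation and would also drop duplicate parts.
                    if (includeSeparators) results.Add(separator);
                    s = s.Substring(separator.Length);
                    currentPart = new StringBuilder();
                    addToIndex = 0;
                }
                else
                {
                    addToIndex = 1;
                }
            }
            // These two Substring calls allocate a new string on every iteration (the slow part).
            currentPart.Append(s.Substring(0, addToIndex));
            s = s.Substring(addToIndex);
        }
        if (currentPart.Length > 0) results.Add(currentPart.ToString());
        return results.ToArray();
    }
}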
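For reference, the direction I imagine would be cheaper is to walk the string by index instead of re-slicing it on every iteration, and only call Substring once per finished part. The following is just a rough sketch of that idea, not a drop-in replacement: `IndexScanSketch`, `IsAt` and `SplitNotEnclosedByIndex` are made-up names, it assumes a single escape character and ordinal comparison, it always includes the separators, and it simply skips the character after an escape instead of toggling a flag.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: index-based scan, no per-character Substring calls.
public static class IndexScanSketch
{
    // Hypothetical helper: true if `token` occurs in `s` exactly at position `i`.
    private static bool IsAt(string s, int i, string token) =>
        token.Length > 0 && string.CompareOrdinal(s, i, token, 0, token.Length) == 0;

    public static List<string> SplitNotEnclosedByIndex(
        string s,
        IEnumerable<string> separators,
        IDictionary<string, string> enclosures,
        char escape = '\\')
    {
        // Longest tokens first, as in the original method.
        var seps = separators.OrderByDescending(v => v.Length).ToArray();
        var opens = enclosures.OrderByDescending(p => p.Key.Length).ToArray();
        var results = new List<string>();
        var closers = new Stack<string>();   // closing tokens we are still waiting for
        int partStart = 0;                   // start index of the part being collected
        int i = 0;
        while (i < s.Length)
        {
            if (closers.Count > 0)
            {
                // Inside an enclosure: skip the escape character and the character after it.
                if (s[i] == escape && i + 1 < s.Length) { i += 2; continue; }
                if (IsAt(s, i, closers.Peek()))
                {
                    i += closers.Peek().Length;
                    closers.Pop();
                    continue;
                }
            }
            else
            {
                // Outside all enclosures: separators end the current part.
                var sep = seps.FirstOrDefault(v => IsAt(s, i, v));
                if (sep != null)
                {
                    if (i > partStart) results.Add(s.Substring(partStart, i - partStart));
                    results.Add(sep);
                    i += sep.Length;
                    partStart = i;
                    continue;
                }
            }
            // An opening token may start a (possibly nested) enclosure.
            var open = opens.FirstOrDefault(p => IsAt(s, i, p.Key));
            if (open.Key != null)
            {
                closers.Push(open.Value);
                i += open.Key.Length;
            }
            else
            {
                i++;
            }
        }
        if (partStart < s.Length) results.Add(s.Substring(partStart));
        return results;
    }
}
```

With separators `" "` and `","` and the enclosures `( )` and `" "`, the input `a(b,c) d,"e\",f"` should come back as `a(b,c)`, a space, `d`, `,` and `"e\",f"`. Whether this is actually faster on real input is something I would still have to measure.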