I have a string that needs to be formatted:
I came up with this:
string Format( string str , string separator )
{
    if( string.IsNullOrEmpty( str ) )
        return string.Empty;
    var words = new List<string>();
    var sb = new StringBuilder();
    foreach( var c in str.ToCharArray() )
    {
        if( char.IsLetterOrDigit( c ) )
        {
            sb.Append( c );
        }
        else if( sb.Length > 0 )
        {
            // A non-alphanumeric character ends the current word.
            words.Add( sb.ToString() );
            sb.Clear();
        }
    }
    // StringBuilder is not IEnumerable<char>, so check Length rather than Any().
    if( sb.Length > 0 )
        words.Add( sb.ToString() );
    return string.Join( separator , words );
}
Is there a better / more LINQ-like / shorter / more performant solution than this (without using regular expressions)?
Answer 0 (score: 2)
You can go "low level" and take advantage of the fact that a string is an IEnumerable<char> by using its GetEnumerator:
string Format(string str, string separator)
{
    var builder = new StringBuilder (str.Length);
    using (var e = str.GetEnumerator ())
    {
        while (e.MoveNext ())
        {
            bool hasMoved = true;
            if (!char.IsLetterOrDigit (e.Current))
            {
                // Skip the rest of the non-alphanumeric run, then emit one separator.
                // Note: leading and trailing runs also produce a separator.
                while ((hasMoved = e.MoveNext ()) && !char.IsLetterOrDigit (e.Current))
                    ;
                builder.Append (separator);
            }
            if (hasMoved)
                builder.Append (e.Current);
        }
    }
    return builder.ToString ();
}
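Just in case, here is a regex version:
// [\W_]+ matches runs of non-word characters plus the underscore.
// (In [^\w-[_]] the subtraction has no effect, so '_' would be left untouched.)
// \w is close to, but not exactly the same set as, char.IsLetterOrDigit.
private static readonly Regex rgx = new Regex(@"[\W_]+", RegexOptions.Compiled);
string Format (string str, string separator)
{
    return rgx.Replace (str, separator);
}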
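Note that Regex.Replace also emits a separator for leading and trailing runs, unlike the code in the question. A minimal sketch of one way to get the question's trimming behavior with a regex is to split instead of replace. The class name `RegexFormatter`, the method name `FormatTrimmed`, and the pattern `[\W_]+` are assumptions made for this illustration.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

static class RegexFormatter
{
    // Assumed pattern: runs of non-word characters plus the underscore.
    private static readonly Regex NonAlnumRun = new Regex(@"[\W_]+", RegexOptions.Compiled);

    public static string FormatTrimmed(string str, string separator)
    {
        if (string.IsNullOrEmpty(str))
            return string.Empty;
        // Regex.Split yields empty strings when the input starts or ends
        // with a match, so filter those out before joining.
        var words = NonAlnumRun.Split(str).Where(w => w.Length > 0);
        return string.Join(separator, words);
    }
}
```

For example, `RegexFormatter.FormatTrimmed("--foo, bar_baz!", "-")` should give `foo-bar-baz`.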
Addendum regarding the OP's comment about a LINQ one-liner:
It is possible, but hardly "easy to understand".
Using an anonymous type:
string Format (string str, string separator)
{
    return str.Aggregate (new { builder = new StringBuilder (str.Length), prevDiscarded = false }, (state, ch) => char.IsLetterOrDigit (ch) ? new { builder = (state.prevDiscarded ? state.builder.Append (separator) : state.builder).Append (ch), prevDiscarded = false } : new { state.builder, prevDiscarded = true }, state => (state.prevDiscarded ? state.builder.Append (separator) : state.builder).ToString ());
}
Using a Tuple instead:
string Format (string str, string separator)
{
    return str.Aggregate (Tuple.Create (new StringBuilder (str.Length), false), (state, ch) => char.IsLetterOrDigit (ch) ? Tuple.Create ((state.Item2 ? state.Item1.Append (separator) : state.Item1).Append (ch), false) : Tuple.Create (state.Item1, true), state => (state.Item2 ? state.Item1.Append (separator) : state.Item1).ToString ());
}
With Tuple we can make it (arguably) a little more readable [though technically it is no longer a one-liner]:
// top of file
using State = System.Tuple<System.Text.StringBuilder, bool>;
string Format (string str, string separator)
{
    var initialState = Tuple.Create (new StringBuilder (str.Length), false);
    Func<State, StringBuilder> addSeparatorIfPrevDiscarded = state => state.Item2 ? state.Item1.Append (separator) : state.Item1;
    Func<State, char, State> aggregator = (state, ch) => char.IsLetterOrDigit (ch) ? Tuple.Create (addSeparatorIfPrevDiscarded (state).Append (ch), false) : Tuple.Create (state.Item1, true);
    Func<State, string> resultSelector = state => addSeparatorIfPrevDiscarded (state).ToString ();
    return str.Aggregate (initialState, aggregator, resultSelector);
}
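What makes this complicated is that (IMO) LINQ* is not well suited to cases where an item's output depends on the previous (or next) item in the same collection.
* Nothing is wrong with LINQ itself, but the Func and anonymous type/tuple syntax quickly adds a lot of noise (perhaps C# 7.0 will change that a bit).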
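As a rough illustration of how C# 7.0 might reduce that noise, here is a sketch of the same aggregation written with value tuples. It assumes a C# 7.0 or later compiler with System.ValueTuple available. The class name `ValueTupleFormatter` is hypothetical, and the behavior is meant to mirror the Tuple version (leading and trailing runs still produce a separator).

```csharp
using System;
using System.Linq;
using System.Text;

static class ValueTupleFormatter
{
    public static string Format(string str, string separator)
    {
        // Named tuple elements replace Item1/Item2 and the State alias.
        var seed = (builder: new StringBuilder(str.Length), prevDiscarded: false);

        var final = str.Aggregate(seed, (state, ch) =>
        {
            if (!char.IsLetterOrDigit(ch))
                return (state.builder, true);   // remember that a character was dropped
            if (state.prevDiscarded)
                state.builder.Append(separator);
            return (state.builder.Append(ch), false);
        });

        // Same as the resultSelector in the Tuple version: a trailing run adds a separator.
        if (final.prevDiscarded)
            final.builder.Append(separator);
        return final.builder.ToString();
    }
}
```

The state still has to be threaded through every step, so the underlying awkwardness remains; only the syntax gets lighter.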
In the same vein, one can also accept side effects and keep only a bool as the state:
string Format (string str, string separator)
{
    var builder = new StringBuilder (str.Length);
    Action<bool> addSeparatorIfPrevDiscarded = prevDiscarded => { if (prevDiscarded) builder.Append (separator); };
    Func<bool, char, bool> aggregator = (prevDiscarded, ch) => {
        if (char.IsLetterOrDigit (ch)) {
            addSeparatorIfPrevDiscarded (prevDiscarded);
            builder.Append (ch);
            return false;
        }
        return true;
    };
    addSeparatorIfPrevDiscarded (str.Aggregate (false, aggregator));
    return builder.ToString ();
}
Answer 1 (score: 1)
Something like this avoids the List<string> and string.Join. It also compiles.
string Format(string str, char separator)
{
    if (string.IsNullOrEmpty(str))
        return string.Empty;
    var sb = new StringBuilder();
    bool previousWasNonAlphaNum = false;
    foreach (var c in str)
    {
        if (char.IsLetterOrDigit(c))
        {
            // StringBuilder exposes Length, not Count.
            // The Length check suppresses a leading separator.
            if (previousWasNonAlphaNum && sb.Length > 0)
                sb.Append(separator);
            sb.Append(c);
        }
        previousWasNonAlphaNum = !char.IsLetterOrDigit(c);
    }
    return sb.ToString();
}
Answer 2 (score: 0)
Try this, it will work:
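string Format(string str, string separator)
{
    // char.Parse throws unless separator is exactly one character.
    var delimiter = char.Parse(separator);
    var replaced = false;
    var cArray = str.Select(c =>
    {
        if (char.IsLetterOrDigit(c))
        {
            replaced = false;
            return c;
        }
        if (!replaced)
        {
            // The first character of a non-alphanumeric run becomes the delimiter.
            replaced = true;
            return delimiter;
        }
        // Rest of the run: mark for removal. '\0' is used instead of ' ' so that
        // a space separator is not stripped from the result.
        return '\0';
    }).Where(c => c != '\0').ToArray();
    return new string(cArray);
}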
Or you can try the following:
string Format(string str, string separator)
{
    var delimiter = char.Parse(separator);
    var cArray = str.Select(c => !char.IsLetterOrDigit(c) ? delimiter : c).ToArray();
    var wlist = new string(cArray).Split(new string[]{separator}, StringSplitOptions.RemoveEmptyEntries);
    return string.Join(separator, wlist);
}
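Both variants above rely on char.Parse(separator), so they only accept a single-character separator. A sketch of one possible alternative (not taken from the answer above) is to collect the non-alphanumeric characters that actually occur in the input and hand them to string.Split. The class name `SplitFormatter` is hypothetical.

```csharp
using System;
using System.Linq;

static class SplitFormatter
{
    public static string Format(string str, string separator)
    {
        if (string.IsNullOrEmpty(str))
            return string.Empty;

        // Every distinct non-alphanumeric character in the input acts as a delimiter.
        char[] delimiters = str.Where(c => !char.IsLetterOrDigit(c)).Distinct().ToArray();

        // RemoveEmptyEntries collapses runs and drops leading/trailing delimiters.
        string[] words = str.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(separator, words);
    }
}
```

With this sketch, `SplitFormatter.Format("  foo, bar!baz ", ", ")` should give `foo, bar, baz`. It walks the string twice, so it trades some performance for brevity.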