I am trying to split a string in C#. The string looks like this:
string line = "red,\"\",blue,\"green\",\"blue,orange\",,\"black\",yellow";
The result should be:
string[] result = { "red", "", "blue", "green", "blue,orange", "", "black", "yellow" };
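Note that the delimiter is "," but it is ignored inside double quotes. Also note that not every substring between delimiters is enclosed in quotes. If possible, I would like an answer in which the delimiter is a string. I don't mind if the double quotes are included in the elements of the result array, for example: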
string[] result = { "red", "\"\"", "blue", "\"green\"", "\"blue,orange\"", "", "\"black\"", "yellow" };
Answer 0: (score: 2)
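Here is a two-state machine that reads each character in the string. When it encounters a double quote, it enters a state in which it treats every subsequent character as part of the value until it encounters another double quote. In the normal state, it builds a string from each character it reads until it reaches a comma, at which point it adds that string to the list of strings to be returned: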
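using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Declare the enum at class or namespace scope; the rest goes inside a method.
enum State {
    InQuotes, // inside a double-quoted section: commas are literal
    InValue   // outside quotes: a comma ends the current field
}

List<String> result = new List<String>();
using( TextReader rdr = new StringReader( line ) ) {
    State state = State.InValue;
    StringBuilder sb = new StringBuilder();
    Int32 nc; Char c;
    while( (nc = rdr.Read()) != -1 ) { // Read() returns -1 at end of input
        c = (Char)nc;
        switch( state ) {
            case State.InValue:
                if( c == '"' ) {
                    // Opening quote: switch state, do not keep the quote itself
                    state = State.InQuotes;
                } else if( c == ',' ) {
                    // Unquoted comma: the current field is complete
                    result.Add( sb.ToString() );
                    sb.Length = 0;
                } else {
                    sb.Append( c );
                }
                break;
            case State.InQuotes:
                if( c == '"' ) {
                    // Closing quote: back to the normal state
                    state = State.InValue;
                } else {
                    sb.Append( c );
                }
                break;
        } // switch
    } // while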
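    // Always emit the final field, even when it is empty (e.g. after a trailing comma);
    // checking sb.Length > 0 here would silently drop it.
    result.Add( sb.ToString() );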
} // using
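The question also asks for a delimiter that is a string and not a single character. Below is a minimal sketch of one way to generalize the state machine above, assuming the delimiter is non-empty and does not itself contain a double quote. The `QuotedSplitter` class and `SplitQuoted` method names are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class QuotedSplitter
{
    // Hypothetical helper: splits `line` on `delimiter`, ignoring any
    // delimiter that appears inside a double-quoted section.
    // Assumes `delimiter` is non-empty and contains no double quote.
    public static List<string> SplitQuoted(string line, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must be non-empty.", nameof(delimiter));

        var result = new List<string>();
        var sb = new StringBuilder();
        bool inQuotes = false; // same two states as the enum-based version
        int i = 0;

        while (i < line.Length)
        {
            char c = line[i];
            if (c == '"')
            {
                // A quote toggles the state and is dropped from the output.
                inQuotes = !inQuotes;
                i++;
            }
            else if (!inQuotes &&
                     string.CompareOrdinal(line, i, delimiter, 0, delimiter.Length) == 0)
            {
                // Unquoted delimiter: close the current field and skip past it.
                result.Add(sb.ToString());
                sb.Clear();
                i += delimiter.Length;
            }
            else
            {
                sb.Append(c);
                i++;
            }
        }

        result.Add(sb.ToString()); // final field, possibly empty
        return result;
    }
}
```

Calling `QuotedSplitter.SplitQuoted(line, ",")` on the string from the question should yield the eight expected elements, with the quotes stripped. Like the original answer, this sketch does not treat a doubled quote (`""`) inside a quoted field as an escaped quote character.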