Regex split on comma but ignoring when inside text qualifier containing two characters

Time: 2015-06-30 19:20:37

Tags: c# regex

I've seen a lot of examples of splitting on a comma while ignoring commas that are inside single or double quotes. I am looking for something similar, but instead of a single or double quote I need the text qualifier to be ~*

I attempted to modify some code I found that used a double quote as the text qualifier, but was unsuccessful. I am terrible with regex and have spent some time today reading the documentation so that I could try to create an expression that works for my use case.

Is it possible to have two characters as the text qualifier?

Example of one of the lines:

~* header1~*, ~* header2 ~*, ~* header3, value1 ~*

I am looking for the output to be:

 ~* header1~*, 
 ~* header2~*, 
 ~* header3,value1~*

var result = Regex.Split(line, ",(?=(?:[^']*'[^']*')*[^']*$)");

3 Answers:

Answer 0: (score: 1)

Do it in two steps.

First, replace every standalone "~* " using the expression "~\*\s", substituting a single space, " ". (This gets rid of the ~* that aren't a new line.)

Then split on "~\*,"
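The two steps above can be sketched in C# roughly as follows. This is only an illustration, assuming every line looks like the sample in the question; the class and method names (`TwoStepSplit`, `Split`) are made up for the example. Note that this approach strips the qualifiers instead of keeping them, and the last field keeps its trailing ~* because no comma follows it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoStepSplit
{
    // Hypothetical helper: the names are illustrative, not from the answer.
    public static string[] Split(string line)
    {
        // Step 1: replace each "~*" that is followed by whitespace (the opening
        // qualifiers in the sample line) with a single space.
        string stripped = Regex.Replace(line, @"~\*\s", " ");

        // Step 2: split on the closing qualifier followed by a comma.
        return Regex.Split(stripped, @"~\*,");
    }

    public static void Main()
    {
        string line = "~* header1~*, ~* header2 ~*, ~* header3, value1 ~*";
        foreach (string part in Split(line))
        {
            // Brackets make leading and trailing whitespace visible.
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

Calling `Trim()` on each part, and removing the leftover ~* from the last one, would be needed if clean values are wanted.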

Edit:

You should be able to split using this expression: "(?<=(~\*,))\s"
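A minimal sketch of the lookbehind split, for illustration only. It assumes a closing qualifier is always immediately followed by the separating comma, and that no field begins with a comma right after its opening ~*. The names `LookbehindSplit` and `SplitFields` are hypothetical. The sketch drops the inner capturing parentheses, because `Regex.Split` adds text captured by groups to the returned array, and it uses `\s*` so that a missing space after the comma still produces a split.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSplit
{
    // Hypothetical helper: splits after each "~*," and keeps the qualifiers.
    public static string[] SplitFields(string line)
    {
        // (?<=~\*,) : the split point must be preceded by "~*,"
        // \s*       : consume any whitespace between fields
        return Regex.Split(line, @"(?<=~\*,)\s*");
    }

    public static void Main()
    {
        string line = "~* header1~*, ~* header2 ~*, ~* header3, value1 ~*";
        foreach (string field in SplitFields(line))
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```

The comma inside the third field is left alone because it is not preceded by ~*.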

Answer 1: (score: 1)

You can use a single regular expression to achieve the desired output:


Each capture group will contain one string. See a working example: https://www.regex101.com/r/zP3aM3/2

Answer 2: (score: 1)

No need to split.

string input = "~* header1~*, ~* header2 ~*, ~* header3, value1 ~*";
string pattern = @"~\* \s* .+? \s* ~\*";
var matches = Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace);
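As a possible extension of the snippet above, the same pattern can carry a named group so that each field is returned without its qualifiers and surrounding whitespace. This is a sketch under the assumption that a field never contains ~* itself; the group name `value` and the names `QualifierMatches` and `ExtractFields` are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QualifierMatches
{
    // Hypothetical helper: returns the text between each pair of "~*" qualifiers.
    public static List<string> ExtractFields(string input)
    {
        // Same pattern as above, with the lazy part wrapped in a named group.
        // IgnorePatternWhitespace lets the pattern be spaced out for readability.
        string pattern = @"~\* \s* (?<value>.+?) \s* ~\*";

        var fields = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace))
        {
            fields.Add(m.Groups["value"].Value);
        }
        return fields;
    }

    public static void Main()
    {
        string input = "~* header1~*, ~* header2 ~*, ~* header3, value1 ~*";
        foreach (string field in ExtractFields(input))
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```

If the qualifiers should be kept, as in the desired output in the question, `m.Value` gives the whole match instead of the group.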