How to use Latin Extended characters in a Regex

Time: 2016-07-06 11:42:53

Tags: c# regex special-characters

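I have a list of characters that contains both ordinary special characters and Latin Extended characters. I want to use these special characters in a Regex.

Special character list: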

var listAdvSpclChar = File.ReadLines(_spclCharFilePath, Encoding.Default);

StringBuilder sb = new StringBuilder();
foreach (string s in listAdvSpclChar)
{
   sb.Append(s);
}
sb.ToString();

Output:

,.()-"*/#ÃŽ‚¦:'‚°?_+~& ¢¬³¹¼;\=%Æ’º¯…™£$‹“]¾Â`^¡Âµ[ž±<}¨!>¸¥Âœ²©·Â«®Ë„§¤¿Â­¶´†»{|

I want to use the special characters above as follows:

Regex.IsMatch(textString, @"[^" + sb + "]");

I get this parsing error:

"[,.()-"*/#ÃŽ‚¦:'‚°?_+~& ¢¬³¹¼ ;\=%Æ’º¯…™£$‹“]¾ `^¡ µ[ž±<}¨!>¸¥ œ²©· «®Ë„§¤¿ ­¶´†»{|]" 
- [x-y] range in reverse order.

If I add \ before every character, I get this parsing error:

"[,\.\(\)\-\"\*\/\#\Ã\ƒ\Å\½\‚\Â\¦\:\'\â\€\š\°\?\_\+\~\&\ \¢\¬\³\¹\¼\ \;\\\=\%\Æ\’\º\¯\…\™\£\$\‹\“\]\¾\ \`\^\¡\ \µ\[\ž\±\<\}\¨\!\>\¸\¥\ \œ\²\©\·\ \«\®\Ë\„\§\¤\¿\ \­\¶\´\†\»\{\|\]" 
- Unrecognized escape sequence \Ã.
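
Both errors can be reproduced with much shorter patterns, which makes the cause easier to see. The sketch below is an illustration only and not part of the original question; `PatternProbe` and `TryParse` are hypothetical names introduced here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternProbe
{
    // Returns null when the pattern parses, otherwise the parser's error message.
    public static string TryParse(string pattern)
    {
        try
        {
            new Regex(pattern);
            return null;
        }
        catch (ArgumentException ex)
        {
            // Regex parse failures are reported as ArgumentException (or a subclass).
            return ex.Message;
        }
    }
}
```

`PatternProbe.TryParse("[^()-\"]")` returns an error because `)-"` is read as a range from `)` (U+0029) down to `"` (U+0022). `PatternProbe.TryParse(@"[^\Ã]")` returns an error because a backslash followed by a letter must form a known escape. `PatternProbe.TryParse(@"[^()\-""]")` returns null, since only the hyphen is escaped.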

I have strings like these:

00000001,0020,0000000000Ø00027006,paper tape 19 28°,759,1648.000 ,1648.000 ,,06092014,12319999,000100022404,HALB,18.51 ,100 ,FS,PT-S12DS120-28,00166789,01,00000015,,00166789,M,01

00000001,0050,000000000000027006,paper tape 19 28°,759,2280.000 ,2280.000 ,,08262015,12319999,000100023811,HALB,18.51 ,100 ,FS,S75P306P-3M,00166882,01,00000021,,00166882,M,010

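The first line above contains Ø, which is not in my regex character list, but the regex does not flag that line as an error line.

The question is: how can I use the special characters above in a regular expression?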

1 answer:

Answer 0 (score: 0)

You only need to escape a few characters; see Metacharacters Inside Character Classes. You can use the following code:

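// Requires: using System.Collections.Generic; using System.IO;
//           using System.Text; using System.Text.RegularExpressions;
var listAdvSpclChar = File.ReadLines("Your Path", Encoding.Default);

// Characters that keep a special meaning inside a character class [...].
var toEscape = new HashSet<char> { '-', '\\', ']' };
const char escape = '\\';

StringBuilder sb = new StringBuilder();
foreach (string line in listAdvSpclChar)
{
    // Check character by character, so a line holding several characters is handled too.
    foreach (char c in line)
    {
        if (toEscape.Contains(c))
        {
            sb.Append(escape);
        }
        sb.Append(c);
    }
}

// And then use it ("textString" stands in for the text to check):
Regex.IsMatch("textString", string.Format("[^{0}]", sb));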
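
The same idea can be wrapped in a small helper, sketched here under some assumptions. `CharWhitelist` and `BuildDisallowedMatcher` are hypothetical names, and the helper also escapes `^` and `[`, which is harmless inside a character class. It assumes the allowed set is non-empty.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class CharWhitelist
{
    // Builds a regex of the form [^...] that matches any character
    // NOT present in the allowed set. Assumes at least one allowed character.
    public static Regex BuildDisallowedMatcher(IEnumerable<char> allowed)
    {
        var sb = new StringBuilder("[^");
        foreach (char c in allowed)
        {
            // Escape the characters that are special inside a character class.
            if (c == '\\' || c == ']' || c == '-' || c == '^' || c == '[')
            {
                sb.Append('\\');
            }
            sb.Append(c);
        }
        sb.Append(']');
        return new Regex(sb.ToString());
    }
}
```

A call such as `CharWhitelist.BuildDisallowedMatcher(allowedChars).IsMatch(line)` then returns true for a line containing a character outside the set, such as Ø. Note that the list in the question holds only special characters, so letters and digits would presumably have to be added to the allowed set as well; otherwise every sample line would match. Separately, the garbled characters in the question's output (for example `Ã` and `Â`) resemble UTF-8 text decoded with a single-byte code page. If the list file is in fact UTF-8, reading it with `Encoding.UTF8` instead of `Encoding.Default` may be needed, but that is a guess from the output and not something the question confirms.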