How to replace non-ASCII Western Latin characters such as '┌', '├', '⌐', '┐', '┴' in XML with C#

Asked: 2016-06-07 06:02:04

Tags: c# non-ascii-characters

How can I remove non-ASCII characters such as '┌', '├', '⌐', '┐', '┴', '└', etc. from XML in C#?

I have tried approaches like "Sanitize Xml String":

(character >= 0x20 && character <= 0xD7FF) ||
(character >= 0xE000 && character <= 0xFFFD) ||
(character >= 0x10000 && character <= 0x10FFFF)

Using Regex as follows:

Regex.Replace(inputText, @"[^><#\w\.@-]", "");
(or)
string str = str.replace(/[^A-Za-z 0-9 \.,\?""!@#\$%\^&\*\(\)-_=\+;:<>\/\\\|\}\{\[\]`~]*/g, '')

And a pattern replacement as follows:

string pattern = @"#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]9[0-9A-F])";

And finally with:

XmlConvert.VerifyXmlChars(text);

But none of these worked. The characters look like this: '┌' '├' '⌐' '┐' '┴'

Please see this link: https://en.wikipedia.org/wiki/Western_Latin_character_sets_%28computing%29

└ U+2514 C0 C0
┘ U+2518 D9 D9

Please help me. Thanks in advance.

1 Answer:

Answer 0 (score: 1)

Try this:

string s = "søme string";
s = Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);
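
A note on why the earlier attempts likely had no effect: characters such as '┌' (U+250C) and '⌐' (U+2310) are valid XML characters, so a validity check like the 0x20..0xD7FF range test or XmlConvert.VerifyXmlChars accepts them rather than removing them. The regex above strips everything outside ASCII, which also drops letters like 'ø'. If that is too aggressive, a narrower filter might be enough. The following is a minimal sketch, not a definitive implementation: the class name XmlTextCleaner and both method names are made up for illustration, and the assumption is that the unwanted characters all come from the Unicode Box Drawing block (U+2500 to U+257F) plus '⌐'.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class XmlTextCleaner
{
    // Removes every character outside 7-bit ASCII (U+0000 to U+007F).
    // This is the approach from the answer; it also drops letters like 'ø'.
    public static string StripNonAscii(string input)
    {
        return Regex.Replace(input, @"[^\u0000-\u007F]", string.Empty);
    }

    // Assumption: the unwanted characters are box-drawing characters
    // (U+2500 to U+257F) and the reversed not sign '⌐' (U+2310).
    // Everything else, including accented letters, is kept.
    public static string StripBoxDrawing(string input)
    {
        return Regex.Replace(input, @"[\u2500-\u257F\u2310]", string.Empty);
    }
}
```

For example, XmlTextCleaner.StripNonAscii("┌søme├ string┘") returns "sme string", while XmlTextCleaner.StripBoxDrawing("┌søme├ string┘") returns "søme string". If other symbols show up in the data, the character class in StripBoxDrawing would need to be extended accordingly.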