Converting Unicode escape sequences to a string

Time: 2017-03-31 14:29:48

Tags: c# parsing text unicode encoding

The text I am importing contains several Unicode escape sequences:

\u0092 \u0093 \u0094 \u0095 \u0096

Sample text (what it should look like):

string str = " Canadian Equity Funds may also invest in or use derivative instruments as described in “Investment Strategies – Use of Derivative Instruments” ";

Sample C# text (what I actually have):

 may also invest in or use derivative instruments as described in \u0093Investment Strategies \u0096 Use of Derivative Instruments\u0094."

I tried using this:

Regex rx = new Regex(@"\\[uU]([0-9A-F]{4})");
var newString = rx.Replace(input, match => 
((char)Int32.Parse(match.Value.Substring(2), NumberStyles.HexNumber)).ToString());

but I get exactly the same string back.
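Before changing the regex, it may help to check what the string really contains, since the question does not show how `input` is populated. If the text was pasted into a regular (non-verbatim) C# string literal, the compiler has already turned `\u0093` into the single character U+0093, so the regex has nothing to match and the string comes back unchanged. The helper below is a minimal diagnostic sketch; the class and method names (`CodePointDump`, `Dump`) are made up for illustration.

```csharp
using System;
using System.Linq;

public static class CodePointDump
{
    // Renders every UTF-16 code unit as U+XXXX so that invisible
    // control characters and literal backslashes become visible.
    public static string Dump(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }
}
```

If `CodePointDump.Dump(input)` shows `U+005C U+0075 ...`, the string holds literal backslash escapes and a regex replacement applies. If it shows `U+0093` directly, the escapes were already decoded and only the resulting characters are wrong.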

I have tried every approach I could find to convert these into the actual text, without success. What should I do?
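One thing worth noting about the values themselves: U+0092 through U+0096 are C1 control characters in Unicode, not punctuation. In Windows-1252, however, bytes 0x92 to 0x96 are ’ “ ” • and –, which matches the expected text above. That suggests (as an assumption, since the source of the import is not shown) that the escapes carry Windows-1252 byte values rather than true Unicode code points. The sketch below decodes the `\uXXXX` escapes and remaps the 0x80 to 0x9F range through a hard-coded Windows-1252 table, so it does not depend on `Encoding.GetEncoding(1252)` being available. `EscapeDecoder` and `Decode` are hypothetical names, and the regex also accepts lowercase hex digits, which the original pattern did not.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class EscapeDecoder
{
    // Windows-1252 characters for bytes 0x80..0x9F, in order.
    // The five bytes that Windows-1252 leaves unassigned map to themselves.
    private const string Cp1252High =
        "\u20AC\u0081\u201A\u0192\u201E\u2026\u2020\u2021" +
        "\u02C6\u2030\u0160\u2039\u0152\u008D\u017D\u008F" +
        "\u0090\u2018\u2019\u201C\u201D\u2022\u2013\u2014" +
        "\u02DC\u2122\u0161\u203A\u0153\u009D\u017E\u0178";

    // A literal backslash, 'u' or 'U', then exactly four hex digits (either case).
    private static readonly Regex Escape = new Regex(@"\\[uU]([0-9A-Fa-f]{4})");

    public static string Decode(string input)
    {
        return Escape.Replace(input, m =>
        {
            int code = int.Parse(m.Groups[1].Value,
                                 NumberStyles.HexNumber,
                                 CultureInfo.InvariantCulture);

            // Assumption: values in 0x80..0x9F are Windows-1252 bytes,
            // not real Unicode code points, so look them up in the table.
            char c = (code >= 0x80 && code <= 0x9F)
                ? Cp1252High[code - 0x80]
                : (char)code;

            return c.ToString();
        });
    }
}
```

With the sample line held in a verbatim literal, for example `EscapeDecoder.Decode(@"described in \u0093Investment Strategies \u0096 Use of Derivative Instruments\u0094.")`, the result is `described in “Investment Strategies – Use of Derivative Instruments”.` If the string already contains the decoded U+0093 style characters instead of backslash escapes, the same table lookup can be applied per character without the regex.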

0 answers:

No answers yet