I am importing text that contains several Unicode escape sequences:
\u0092 \u0093 \u0094 \u0095 \u0096
Sample text:
string str = "Canadian Equity Funds may also invest in or use derivative instruments as described in “Investment Strategies – Use of Derivative Instruments”.";
Sample C# text:
may also invest in or use derivative instruments as described in \u0093Investment Strategies \u0096 Use of Derivative Instruments\u0094."
I tried using this:
Regex rx = new Regex(@"\\[uU]([0-9A-F]{4})");
var newString = rx.Replace(input, match =>
((char)Int32.Parse(match.Value.Substring(2), NumberStyles.HexNumber)).ToString());
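I get back exactly the same string.
I have tried everything I can think of to convert these sequences into the actual characters, without success. What should I do?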
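For reference, here is a minimal sketch of one possible explanation, under two assumptions that are not confirmed by the text above. First, the code points U+0092 to U+0096 are C1 control characters in Unicode, but in Windows-1252 the bytes 0x92 to 0x96 are ’ “ ” • and –, so the source text may have been Windows-1252 that was decoded with the wrong encoding. Second, if the sample is pasted into a regular (non-verbatim) C# string literal, the compiler already turns \u0093 into a single character, so the regex finds no backslash to match and returns the input unchanged. The class name Cp1252Fixer and the method Fix are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
static class Cp1252Fixer
{
    // Windows-1252 meanings of the five code points listed in the question.
    static readonly Dictionary<char, char> Map = new Dictionary<char, char>
    {
        { '\u0092', '\u2019' }, // right single quotation mark
        { '\u0093', '\u201C' }, // left double quotation mark
        { '\u0094', '\u201D' }, // right double quotation mark
        { '\u0095', '\u2022' }, // bullet
        { '\u0096', '\u2013' }, // en dash
    };

    // Matches a literal backslash, 'u' or 'U', and four hex digits in the text.
    static readonly Regex Escape = new Regex(@"\\[uU]([0-9A-Fa-f]{4})");

    public static string Fix(string input)
    {
        // Step 1: if the text holds literal escape sequences, turn them into chars.
        string unescaped = Escape.Replace(input, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());

        // Step 2: remap the C1 control characters to their Windows-1252 meaning.
        var sb = new StringBuilder(unescaped.Length);
        foreach (char c in unescaped)
        {
            char mapped;
            sb.Append(Map.TryGetValue(c, out mapped) ? mapped : c);
        }
        return sb.ToString();
    }
}

static class Program
{
    static void Main()
    {
        string expected = "\u201CInvestment Strategies \u2013 Use\u201D";

        // Case A: the text really contains backslash-u sequences (note the doubled backslashes).
        string escaped = "\\u0093Investment Strategies \\u0096 Use\\u0094";
        // Case B: the compiler has already converted the escapes to control characters.
        string compiled = "\u0093Investment Strategies \u0096 Use\u0094";

        Console.WriteLine(Cp1252Fixer.Fix(escaped) == expected);
        Console.WriteLine(Cp1252Fixer.Fix(compiled) == expected);
    }
}
```

The explicit table covers only the five characters mentioned in the question. If the Windows-1252 assumption holds for the whole file, decoding the original bytes with the correct encoding at import time would likely be the cleaner fix.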