How do I convert Unicode UCS-2 text into something readable in C#?

Date: 2012-07-20 16:19:00

Tags: c# unicode ucs2

I received the following message as an SMS on a SIM900 GPRS module.

  

07916698019021F00410D05479BDDC7CBBCB790008217002123430826A0049006E0063006F00720072006500630074002000700061007300730077006F00720064002E00200050006C050610306500065060740507202079060750702007001070730700F0700402001060010600E02

Another sample message:

  

07916698019021F00410D05479BDDC7CBBCB790008217002025501826A0049006E0063006F00720072006500630074002000700061007300730077006F00720064002E00200050006C06001073050200506E04065070200906F07072020700607307007060020600006007060090600E

I think this message is in Unicode UCS-2 format and that the text is in Thai, but I cannot convert it into anything readable. I found this very useful code:

//Here's how you'd go from a string to stuff like
// U+0053 U+0063 U+006f
string scott = "ฉ";
foreach (char s in scott) {
  Console.Write("{0:x4} ", (int)s);
}
// Here's how you'd convert a string (assuming it starts with U+)
// containing the representation of a char
// back to a char.
// Is there a built-in, or cleaner, way? Would this work in Chinese?
string maybeC = "U+0063";
int p = int.Parse(maybeC.Substring(2),
 System.Globalization.NumberStyles.HexNumber);
Console.WriteLine((char)p);

Thanks in advance.

2 answers:

Answer 0 (score: 4)

Reading on Wikipedia, I found this article, which says that UCS-2 is very similar to UTF-16. So:

string s = "07916698019021F00410D05479BDDC7CBBCB790008217002025501826A0049006E0063006F00720072006500630074002000700061007300730077006F00720064002E00200050006C06001073050200506E04065070200906F07072020700607307007060020600006007060090600E";
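// Requires: using System.Collections.Generic; using System.Globalization; using System.Text;
// Convert each pair of hex digits into one byte.
List<byte> bytes = new List<byte>();
for (int i = 0; i < s.Length; i += 2)
{
    bytes.Add(byte.Parse(s.Substring(i, 2), NumberStyles.HexNumber));
}

// Encoding.Unicode is little-endian UTF-16. Note that this decodes the
// whole PDU, including the header bytes that precede the message text.
var str = Encoding.Unicode.GetString(bytes.ToArray());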

Output: 鄇顦送င哐뵹糜쮻y℈ɰ唂舁jIncorrect password. P٬ကճ湐؄灐ठ牰܂怀ݳ瀀ɠ怀؇退
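
The leading garbage in that output comes from decoding the PDU header as if it were text. A minimal sketch of skipping the header first is given below, assuming the string is a standard SMS-DELIVER PDU with no user data header and a data coding scheme of 0x08 (UCS-2), which is how the two samples appear to be laid out. The class and method names (SmsPduSketch, HexToBytes, DecodeUcs2Deliver) are made up for illustration and are not part of any library. UCS-2 text in an SMS is stored big-endian, so the sketch uses Encoding.BigEndianUnicode rather than Encoding.Unicode.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SmsPduSketch
{
    // Converts a hex string such as "00480069" into raw bytes.
    // A trailing odd nibble, if any, is ignored.
    public static byte[] HexToBytes(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = byte.Parse(hex.Substring(i * 2, 2), NumberStyles.HexNumber);
        }
        return bytes;
    }

    // Returns the text of an SMS-DELIVER PDU.
    // Assumptions: no user data header, and DCS == 0x08 (UCS-2).
    public static string DecodeUcs2Deliver(string pduHex)
    {
        byte[] pdu = HexToBytes(pduHex);
        int pos = 0;

        pos += 1 + pdu[pos];                  // SMSC length octet plus SMSC info
        pos += 1;                             // first octet (message type flags)
        int senderSemiOctets = pdu[pos];      // sender address length in semi-octets
        pos += 2 + (senderSemiOctets + 1) / 2; // length octet, type octet, address
        pos += 1;                             // protocol identifier
        byte dcs = pdu[pos++];                // data coding scheme
        if (dcs != 0x08)
        {
            throw new NotSupportedException("Only UCS-2 (DCS 0x08) is handled in this sketch.");
        }
        pos += 7;                             // service centre timestamp
        int userDataLength = pdu[pos++];      // length in octets for UCS-2

        // Guard against truncated input and keep whole 16-bit units only.
        int available = Math.Min(userDataLength, pdu.Length - pos);
        available -= available % 2;

        return Encoding.BigEndianUnicode.GetString(pdu, pos, available);
    }

    public static void Main()
    {
        // Header copied from the first sample, followed by a 4-octet body.
        string pdu = "07916698019021F00410D05479BDDC7CBBCB79000821700212343082"
                   + "04" + "00480069";
        Console.WriteLine(DecodeUcs2Deliver(pdu)); // prints "Hi"
    }
}
```

Applied to the sample messages, the decoded text should begin with "Incorrect password. Pl". The hex digits after that point do not form valid 16-bit units as posted, so the remainder cannot be recovered from the strings shown here.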

Answer 1 (score: 0)

Try using the built-in System.Text.Encoding class.

using System.Text;
// ..
var bytes = Encoding.GetEncoding("ucs-2").GetBytes("SomeString");

Edit: you can use GetString(byte[]) to convert from the UCS-2 / UTF-16 encoding back to a string.
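
Byte order matters when calling GetString on UCS-2 data. The following is a small sketch, not a definitive recipe, that contrasts the two UTF-16 encodings built into .NET on the same four bytes; the class and method names (ByteOrderSketch, DecodeBigEndian, DecodeLittleEndian) are illustrative only.

```csharp
using System;
using System.Text;

public static class ByteOrderSketch
{
    // Treats each pair of bytes as high byte first (the order used in SMS UCS-2).
    public static string DecodeBigEndian(byte[] data)
    {
        return Encoding.BigEndianUnicode.GetString(data);
    }

    // Treats each pair of bytes as low byte first (what Encoding.Unicode does).
    public static string DecodeLittleEndian(byte[] data)
    {
        return Encoding.Unicode.GetString(data);
    }

    public static void Main()
    {
        // The first two UCS-2 code units of "Incorrect" as they appear in the PDU.
        byte[] data = { 0x00, 0x49, 0x00, 0x6E };

        Console.WriteLine(DecodeBigEndian(data)); // prints "In"

        // Little-endian decoding swaps the bytes: U+4900 and U+6E00.
        foreach (char c in DecodeLittleEndian(data))
        {
            Console.Write("{0:x4} ", (int)c); // prints "4900 6e00 "
        }
        Console.WriteLine();
    }
}
```

Choosing the big-endian encoding only gives readable text once the decoding starts at the first byte of the message body, so the PDU header has to be skipped before calling GetString.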