I am trying to convert a non-Unicode string such as '¹ûº¤¢¤¤ì©2' into Unicode, 'ໃຊ້ໃນຄົວເຮືອນ', which is Lao. I tried the code below, but it returns '??????'. Any idea how to convert the string?
Public Shared Function ConvertAsciiToUnicode(asciiString As String) As String
    ' Create two different encodings.
    Dim encAscii As Encoding = Encoding.ASCII
    Dim encUnicode As Encoding = Encoding.Unicode
    ' Convert the string into a byte[].
    Dim asciiBytes As Byte() = encAscii.GetBytes(asciiString)
    ' Perform the conversion from one encoding to the other.
    Dim unicodeBytes As Byte() = Encoding.Convert(encAscii, encUnicode, asciiBytes)
    ' Convert the new byte[] into a char[] and then into a string.
    ' This is a slightly different approach to converting to illustrate
    ' the use of GetCharCount/GetChars.
    Dim unicodeChars As Char() = New Char(encUnicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length) - 1) {}
    encUnicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0)
    Dim unicodeString As New String(unicodeChars)
    ' Return the new unicode string
    Return unicodeString
End Function
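Answer 0 (score: 4)
Your 8-bit encoded Lao text is not ASCII. It is in some code page, such as IBM CP1133 or Microsoft LC0454, or most likely the Thai code page 874. You have to find out which one it is.
What matters is how you obtain (read, receive, compute) the input string. Once you have it as a String, it is already Unicode and is easy to output as UTF-8, for example like this: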
Dim writer As New StreamWriter("myfile.txt", True, System.Text.Encoding.UTF8)
writer.Write(mystring)
writer.Close()
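Because the decoding has to happen at the point where the bytes enter the program, reading a file might look like the minimal sketch below. The module name, the `ConvertFileToUtf8` helper, and its parameters are made up for illustration, and the code page you pass in is whichever one the data turns out to use (874 is only the guess made above).

```vbnet
Imports System.IO
Imports System.Text

Module LegacyFileReader
    ' Hypothetical helper: decodes a file stored in a legacy code page
    ' and rewrites it as UTF-8.
    ' codePage is an assumption; pass the code page the data is really in.
    Sub ConvertFileToUtf8(inputPath As String, outputPath As String, codePage As Integer)
        ' On .NET Core / .NET 5+, legacy code pages need a one-time
        ' Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
        Dim source As Encoding = Encoding.GetEncoding(codePage)
        ' Decoding with the correct encoding yields a proper Unicode String.
        Dim content As String = File.ReadAllText(inputPath, source)
        ' Re-encode the String as UTF-8 on the way out.
        File.WriteAllText(outputPath, content, Encoding.UTF8)
    End Sub
End Module
```

A call such as `ConvertFileToUtf8("input.txt", "output.txt", 874)` would then do the whole round trip; both file names are placeholders.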
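Here is the whole conversion done in memory:
' Despite its name, utf8_input holds the raw input bytes in the source
' code page (874 here); only the converted array is UTF-8.
Dim utf8_input As Byte()
...
Dim converted As Byte() = Encoding.Convert(Encoding.GetEncoding(874), Encoding.UTF8, utf8_input)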
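If the goal is a .NET String rather than a UTF-8 byte array, the in-memory step can be sketched as below. Both helper names are invented for this example. `RepairMisdecoded` covers the case in the question, where the text is already a String of wrong characters; it only works under the assumption that you know which encoding was applied by mistake and that this encoding maps every byte to a character without loss (Latin-1, code page 28591, does).

```vbnet
Imports System.Text

Module LegacyText
    ' Hypothetical helper: decodes raw bytes from a legacy code page
    ' into a .NET String.
    Function DecodeLegacyBytes(raw As Byte(), codePage As Integer) As String
        Return Encoding.GetEncoding(codePage).GetString(raw)
    End Function

    ' Hypothetical helper: repairs a String whose bytes were decoded with
    ' the wrong encoding. wrongCodePage is the encoding applied by mistake
    ' (an assumption you must verify); realCodePage is the actual one.
    Function RepairMisdecoded(garbled As String,
                              wrongCodePage As Integer,
                              realCodePage As Integer) As String
        ' Turn the wrong characters back into the original bytes ...
        Dim raw As Byte() = Encoding.GetEncoding(wrongCodePage).GetBytes(garbled)
        ' ... then decode those bytes with the encoding they were written in.
        Return Encoding.GetEncoding(realCodePage).GetString(raw)
    End Function
End Module
```

Note that this differs from the code in the question in one key way: it never passes through `Encoding.ASCII`, which replaces every character above 127 with `?`.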
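The number 874 is the number of the code page of your input. Whether a particular operating system installation supports this code page is another question, but if you used that same system to write your Stack Overflow question, it almost certainly does.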
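Whether a given code page is available can be probed at run time. The following is a rough sketch, with `IsCodePageSupported` being a name chosen here rather than a framework method. It relies on `Encoding.GetEncoding` throwing `ArgumentException` or `NotSupportedException` for code pages it cannot provide.

```vbnet
Imports System
Imports System.Text

Module CodePageCheck
    ' Hypothetical helper: reports whether this runtime can provide
    ' an Encoding for the given code page number.
    Function IsCodePageSupported(codePage As Integer) As Boolean
        Try
            ' On .NET Core / .NET 5+, legacy code pages such as 874 are only
            ' found after Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
            Dim enc As Encoding = Encoding.GetEncoding(codePage)
            Return enc IsNot Nothing
        Catch ex As ArgumentException
            ' Invalid or unsupported code page number.
            Return False
        Catch ex As NotSupportedException
            ' No encoding data available on this platform.
            Return False
        End Try
    End Function
End Module
```

Calling `IsCodePageSupported(874)` before converting lets the program fail with a clear message instead of an exception deep inside the conversion.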