Which System.Text.Encoding does CharSet.Ansi use?
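I want to decode strings in a .NET Core application (previously marshalled by C++ code) without defining a struct and using Marshal.PtrToStructure. In other words, I am looking for the right argument for Encoding.GetEncoding(???).GetString(...).
In a .NET Framework application, System.Text.Encoding.Default works: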
using System;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading;

[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct Structure
{
    // Fixed-size ANSI string: 4 characters plus the terminating NUL.
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 5)]
    public string FieldA;
}

public class Net461App
{
    static void Main(string[] args)
    {
        var @struct = new Structure { FieldA = "äöüß" };
        byte[] buffer = ToByteArray(@struct);
        var unmarshalled = ToStructure<Structure>(buffer).FieldA; // "äöüß"

        Console.WriteLine(Encoding.Default.GetString(buffer).Trim('\0')); // "äöüß"
        Console.WriteLine(Encoding.Default.EncodingName); // Western European (Windows)
        Console.WriteLine(Encoding.Default.CodePage); // 1252

        int ansiCodePage = Thread.CurrentThread.CurrentCulture.TextInfo.ANSICodePage; // 1252
        Encoding ansiEncoding = Encoding.GetEncoding(ansiCodePage); // works
        Console.WriteLine(ansiEncoding.GetString(buffer).Trim('\0')); // "äöüß"
        Console.WriteLine(ansiEncoding.EncodingName); // Western European (Windows)
        Console.WriteLine(ansiEncoding.CodePage); // 1252
    }

    // Marshals the structure into unmanaged memory and copies the raw bytes out.
    public static byte[] ToByteArray<T>(T structure) where T : struct
    {
        var buffer = new byte[Marshal.SizeOf(structure)];
        IntPtr handle = Marshal.AllocHGlobal(buffer.Length);
        try
        {
            // fDeleteOld is false: the freshly allocated block holds no previous structure to destroy.
            Marshal.StructureToPtr(structure, handle, false);
            Marshal.Copy(handle, buffer, 0, buffer.Length);
            return buffer;
        }
        finally
        {
            Marshal.FreeHGlobal(handle);
        }
    }

    // Copies the raw bytes into unmanaged memory and unmarshals them into T.
    public static T ToStructure<T>(byte[] buffer) where T : struct
    {
        IntPtr handle = Marshal.AllocHGlobal(buffer.Length);
        try
        {
            Marshal.Copy(buffer, 0, handle, buffer.Length);
            return Marshal.PtrToStructure<T>(handle);
        }
        finally
        {
            Marshal.FreeHGlobal(handle);
        }
    }
}
Encoding.Default has the same code page as Thread.CurrentThread.CurrentCulture.TextInfo.ANSICodePage, and both produce the same string as CharSet.Ansi. But is TextInfo.ANSICodePage always the same as CharSet.Ansi?
Encoding.Default is different in .NET Core, and code page 1252 is not supported:
public class NetCore2App
{
    static void Main(string[] args)
    {
        var @struct = new Structure { FieldA = "äöüß" };
        byte[] buffer = ToByteArray(@struct);
        var unmarshalled = ToStructure<Structure>(buffer).FieldA; // "äöüß"

        Console.WriteLine(Encoding.Default.GetString(buffer).Trim('\0')); // "????"
        Console.WriteLine(Encoding.Default.EncodingName); // Unicode (UTF-8)
        Console.WriteLine(Encoding.Default.CodePage); // 65001

        int ansiCodePage = Thread.CurrentThread.CurrentCulture.TextInfo.ANSICodePage; // 1252
        Encoding ansiEncoding = Encoding.GetEncoding(ansiCodePage); // throws "No data is available for encoding 1252."
        Console.WriteLine(ansiEncoding.GetString(buffer).Trim('\0')); // ...
        Console.WriteLine(ansiEncoding.EncodingName); // ...
        Console.WriteLine(ansiEncoding.CodePage); // ...
    }

    // ...
}
Update
Investigating the suggestion to use System.Text.Encoding.CodePages, I found a hint about getting the system's current ANSI code page through Encoding.GetEncoding.
The following seems to work:
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
int currentAnsiCodePage = Encoding.GetEncoding(0).CodePage;
Encoding encoding = Encoding.GetEncoding(currentAnsiCodePage);
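The approaches above (with CodePagesEncodingProvider registered) give me the same code page and decode the strings in my tests correctly. I am not sure which one is more appropriate and works on all machines.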
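For reference, here is a minimal sketch of the decoding step on its own, without the struct or Marshal.PtrToStructure. It is only an illustration under some assumptions: the names AnsiDecoding and DecodeAnsiField are hypothetical, code page 1252 is hard-coded because that is what my machine reports above, and depending on the target framework the project may need a reference to the System.Text.Encoding.CodePages package.

```csharp
using System;
using System.Text;

public static class AnsiDecoding
{
    // Hypothetical helper: decodes a fixed-size ANSI field up to the first NUL byte.
    public static string DecodeAnsiField(byte[] buffer, Encoding encoding)
    {
        int length = Array.IndexOf(buffer, (byte)0);
        if (length < 0)
        {
            length = buffer.Length; // no terminator: the field is completely filled
        }
        return encoding.GetString(buffer, 0, length);
    }

    public static void Main()
    {
        // Makes the Windows code pages (such as 1252) available on .NET Core.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Assumption: 1252 is hard-coded here; it is not looked up from the system.
        Encoding windows1252 = Encoding.GetEncoding(1252);

        // "äöüß" as Windows-1252 bytes, followed by the terminating NUL.
        byte[] buffer = { 0xE4, 0xF6, 0xFC, 0xDF, 0x00 };
        string decoded = DecodeAnsiField(buffer, windows1252);

        // Compare against escaped characters so the result does not depend on the console encoding.
        Console.WriteLine(decoded == "\u00E4\u00F6\u00FC\u00DF");
    }
}
```

Stopping at the first NUL instead of calling Trim('\0') avoids picking up stale bytes that may follow the terminator in a fixed-size field. Passing the Encoding in as a parameter leaves the open question (which code page matches CharSet.Ansi on a given machine) to the caller.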