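Serializing and deserializing char(s)

Time: 2017-04-16 11:04:24

Tags: c# json.net

I have a list of chars in my class. Serialization and deserialization work as expected, except when the list contains a char that (as far as I can tell) needs a byte order mark to be described. An example char code is 56256. I wrote the simple test below to reproduce the problem.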

[Test]
public void Utf8CharSerializeAndDeserializeShouldEqual()
{
    UInt16 charCode = 56256;
    char utfChar = (char)charCode;
    using (MemoryStream ms = new MemoryStream())
    {
        using (StreamWriter writer = new StreamWriter(ms, Encoding.UTF8, 1024, true))
        {
            var serializer = new JsonSerializer();
            serializer.Serialize(writer, utfChar);
        }

        ms.Position = 0;
        using (StreamReader reader = new StreamReader(ms, true))
        {
            using (JsonTextReader jsonReader = new JsonTextReader(reader))
            { 
                var serializer = new JsonSerializer();
                char deserializedChar = serializer.Deserialize<char>(jsonReader);

                Console.WriteLine($"{(int)utfChar}, {(int)deserializedChar}");
                Assert.AreEqual(utfChar, deserializedChar);
                Assert.AreEqual((int)utfChar, (int)deserializedChar);
            }
        }
    }
}

When the char code does not need a BOM, the test works fine. For example, 65 (A) passes this test.

1 answer:

Answer 0: (score: 1)

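Your problem has nothing to do with Json.NET. The problem is that U+DBC0 (decimal 56256) is not a valid Unicode character on its own: it is a high surrogate, one half of a UTF-16 surrogate pair. As explained in the documentation, the Encoding.UTF8 used by your StreamWriter will not encode such a character:

  Encoding.UTF8 returns a UTF8Encoding object that uses replacement fallback to replace each string that it cannot encode and each byte that it cannot decode with a question mark ("?") character.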

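To confirm this, replace Encoding.UTF8 with new UTF8Encoding(true, true) in your test. The encoder then throws instead of substituting, and you get the following exception: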

EncoderFallbackException: Unable to translate Unicode character \uDBC0 at index 1 to specified code page. 
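
The same behavior can be reproduced without Json.NET or any streams. The following is a minimal sketch, not part of the original answer; the class and method names (LoneSurrogateDemo, SurvivesUtf8RoundTrip, StrictEncoderRejects) are made up for illustration. It only uses Encoding.UTF8, UTF8Encoding and EncoderFallbackException from System.Text.

```csharp
using System;
using System.Text;

public static class LoneSurrogateDemo
{
    // True when the char comes back unchanged after a UTF-8 encode/decode round trip
    // with the default (replacement fallback) encoding.
    public static bool SurvivesUtf8RoundTrip(char c)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(new[] { c });
        string decoded = Encoding.UTF8.GetString(bytes);
        return decoded.Length == 1 && decoded[0] == c;
    }

    // True when a strict encoder (throwOnInvalidBytes: true) refuses to encode the char.
    public static bool StrictEncoderRejects(char c)
    {
        var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
        try
        {
            strict.GetBytes(new[] { c });
            return false;
        }
        catch (EncoderFallbackException)
        {
            return true;
        }
    }
}
```

Calling SurvivesUtf8RoundTrip((char)56256) should return false while SurvivesUtf8RoundTrip('A') returns true, which matches the failing and passing cases in the question. char.IsSurrogate can be used to detect such values up front.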

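If you need to serialize invalid Unicode char values, you will have to encode them manually, for example as a byte array, using extension methods such as the following: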

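using System.Collections.Generic;

// Copies the raw 16-bit value of each char into two bytes (low byte first),
// bypassing any text encoding.
public static partial class TextExtensions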
{
    static void ToBytesWithoutEncoding(char c, out byte lower, out byte upper)
    {
        var u = (uint)c;
        lower = unchecked((byte)u);
        upper = unchecked((byte)(u >> 8));
    }

    public static byte[] ToByteArrayWithoutEncoding(this char c)
    {
        byte lower, upper;
        ToBytesWithoutEncoding(c, out lower, out upper);
        return new byte[] { lower, upper };
    }

    public static byte[] ToByteArrayWithoutEncoding(this ICollection<char> list)
    {
        if (list == null)
            return null;
        var bytes = new byte[checked(list.Count * 2)];
        int to = 0;
        foreach (var c in list)
        {
            ToBytesWithoutEncoding(c, out bytes[to], out bytes[to + 1]);
            to += 2;
        }
        return bytes;
    }

    public static char ToCharWithoutEncoding(this byte[] bytes)
    {
        return bytes.ToCharWithoutEncoding(0);
    }

    public static char ToCharWithoutEncoding(this byte[] bytes, int position)
    {
        if (bytes == null)
            return default(char);
        char c = default(char);
        if (position < bytes.Length)
            c += (char)bytes[position];
        if (position + 1 < bytes.Length)
            c += (char)((uint)bytes[position + 1] << 8);
        return c;
    }

    public static List<char> ToCharListWithoutEncoding(this byte[] bytes)
    {
        if (bytes == null)
            return null;
        var chars = new List<char>(bytes.Length / 2 + bytes.Length % 2);
        for (int from = 0; from < bytes.Length; from += 2)
        {
            chars.Add(bytes.ToCharWithoutEncoding(from));
        }
        return chars;
    }
}

Then modify your test method as follows:

    [Test]
    public void Utf8JsonCharSerializeAndDeserializeShouldEqualFixed()
    {
        Utf8JsonCharSerializeAndDeserializeShouldEqualFixed((char)56256);
    }

    public void Utf8JsonCharSerializeAndDeserializeShouldEqualFixed(char utfChar)
    {
        byte[] data;

        using (MemoryStream ms = new MemoryStream())
        {
            using (StreamWriter writer = new StreamWriter(ms, new UTF8Encoding(true, true), 1024))
            {
                var serializer = new JsonSerializer();
                serializer.Serialize(writer, utfChar.ToByteArrayWithoutEncoding());
            }
            data = ms.ToArray();
        }

        using (MemoryStream ms = new MemoryStream(data))
        {
            using (StreamReader reader = new StreamReader(ms, true))
            {
                using (JsonTextReader jsonReader = new JsonTextReader(reader))
                {
                    var serializer = new JsonSerializer();
                    char deserializedChar = serializer.Deserialize<byte[]>(jsonReader).ToCharWithoutEncoding();

                    //Console.WriteLine(string.Format("{0}, {1}", utfChar, deserializedChar));
                    Assert.AreEqual(utfChar, deserializedChar);
                    Assert.AreEqual((int)utfChar, (int)deserializedChar);
                }
            }
        }
    }

Alternatively, if you have a List<char> property in some container class, you can create the following converter:

public class CharListConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(List<char>);
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null)
            return null;
        var bytes = serializer.Deserialize<byte[]>(reader);
        return bytes.ToCharListWithoutEncoding();
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var list = (ICollection<char>)value;
        var bytes = list.ToByteArrayWithoutEncoding();
        serializer.Serialize(writer, bytes);
    }
}

And apply it as follows:

public class RootObject
{
    [JsonConverter(typeof(CharListConverter))]
    public List<char> Characters { get; set; }
}

In both cases, Json.NET will encode the byte array as a Base64 string.
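
To get a feel for what ends up in the JSON, the sketch below computes the Base64 text for a single char using only Convert from the base class library. It is an illustration rather than part of the original answer: the Base64PayloadDemo class and its methods are hypothetical names, and the byte order (low byte first) is assumed to match ToBytesWithoutEncoding above.

```csharp
using System;

public static class Base64PayloadDemo
{
    // Splits the char into low and high bytes (low byte first) and Base64-encodes them.
    public static string ToBase64(char c)
    {
        byte lower = unchecked((byte)c);
        byte upper = unchecked((byte)(c >> 8));
        return Convert.ToBase64String(new[] { lower, upper });
    }

    // Reverses ToBase64: decodes two bytes and reassembles the 16-bit char value.
    public static char FromBase64(string text)
    {
        byte[] bytes = Convert.FromBase64String(text);
        return (char)(bytes[0] | (bytes[1] << 8));
    }
}
```

For (char)56256 the bytes are 0xC0 and 0xDB, so ToBase64 returns "wNs=". That is pure ASCII, so it passes through any UTF-8 writer unchanged, and FromBase64 restores the original code unit.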