Suppose I have a byte array:
var myArr = new byte[] { 0x61, 0x62, 0xc4, 0x85, 0xc4, 0x87 };
It has 6 elements and is the UTF-8 encoding of abąć, which has 4 letters. Normally you would call
Encoding.UTF8.GetString(myArr);
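to convert it to a string. But suppose myArr is actually larger (there are more bytes at the end), and I know before converting that I only want the first 4 letters. How can I efficiently convert this array to a string? Ideally I would also get the index in myArr of the last byte that corresponds to the end of the converted string.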
Example:
// 3 more bytes at the end of formerly defined myArr
var myArr = new byte[] { 0x61, 0x62, 0xc4, 0x85, 0xc4, 0x87, 0x01, 0x02, 0x03 };
var str = MyConvert(myArr, 4); // read 4 utf8 letters
// str is "abąć"
// possibly I want to know that MyConvert stopped at index 6 in myArr
The resulting string str should have str.Length == 4.
Answer 0 (score: 3)
It looks like Decoder has your back, in particular its somewhat large Convert method. I think you want:
var decoder = Encoding.UTF8.GetDecoder();
var chars = new char[4];
decoder.Convert(bytes, 0, bytes.Length, chars, 0, chars.Length,
true, out int bytesUsed, out int charsUsed, out bool completed);
Complete example using the data from your question:
using System;
using System.Text;

public class Test
{
    static void Main()
    {
        // The 6 bytes of "abąć" in UTF-8, followed by 3 unrelated bytes.
        var bytes = new byte[] { 0x61, 0x62, 0xc4, 0x85, 0xc4, 0x87, 0x01, 0x02, 0x03 };
        var decoder = Encoding.UTF8.GetDecoder();

        // The output buffer holds only 4 chars, so decoding stops once it is full.
        var chars = new char[4];
        decoder.Convert(bytes, 0, bytes.Length, chars, 0, chars.Length,
            true, out int bytesUsed, out int charsUsed, out bool completed);

        // bytesUsed is the number of input bytes consumed to produce those chars.
        Console.WriteLine($"Completed: {completed}");
        Console.WriteLine($"Bytes used: {bytesUsed}");
        Console.WriteLine($"Chars used: {charsUsed}");
        Console.WriteLine($"Text: {new string(chars, 0, charsUsed)}");
    }
}
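To get closer to the MyConvert call in the question, the same Decoder.Convert call can be wrapped in a helper. This is only a sketch: the class name Utf8Prefix and the extra out parameter are my own assumptions, not part of any library, and the count is in UTF-16 chars (what str.Length measures), so a character outside the Basic Multilingual Plane counts as 2.

```csharp
using System;
using System.Text;

public static class Utf8Prefix
{
    // Hypothetical helper named after the MyConvert call in the question.
    // Decodes at most maxChars UTF-16 chars from the start of bytes and
    // reports through bytesUsed how many input bytes were consumed.
    public static string MyConvert(byte[] bytes, int maxChars, out int bytesUsed)
    {
        bytesUsed = 0;
        if (maxChars <= 0 || bytes.Length == 0)
        {
            // Nothing to decode; also avoids calling Convert with an empty output buffer.
            return string.Empty;
        }

        var decoder = Encoding.UTF8.GetDecoder();
        var chars = new char[maxChars];

        // Same call as in the example above; the "completed" flag is discarded.
        decoder.Convert(bytes, 0, bytes.Length, chars, 0, chars.Length,
            true, out bytesUsed, out int charsUsed, out _);

        return new string(chars, 0, charsUsed);
    }
}
```

With the 9-byte array from the question, Utf8Prefix.MyConvert(myArr, 4, out int used) returns "abąć" and sets used to 6, which is the index of the first byte that was not read. One caveat: as far as I know, the documentation for Decoder.Convert recommends an output buffer of at least 2 chars so that a surrogate pair fits, so a maxChars that would cut a pair in half may throw rather than return a shorter string.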