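I decompress received GZIPped data into a string. With BUFFER_SIZE set to 512, any Unicode character that straddles the buffer boundary gets broken, and I end up with question marks in the text. It happens with non-Latin letters.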
...во и ��ргуме...
    public static String decompress(byte[] compressed) throws IOException {
        final int BUFFER_SIZE = 512;
        ByteArrayInputStream is = new ByteArrayInputStream(compressed);
        GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
        StringBuilder string = new StringBuilder();
        byte[] data = new byte[BUFFER_SIZE];
        int bytesRead;
        while ((bytesRead = gis.read(data)) != -1) {
            string.append(new String(data, 0, bytesRead));
        }
        gis.close();
        is.close();
        return string.toString();
    }
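Answer 0 (score: 4)
The bug is in the algorithm: it assumes that every chunk read ends (and begins) on a UTF-8 byte-sequence boundary. When a multi-byte character is split across two chunks, each half is decoded on its own and comes out as a replacement character.
So collect all the bytes first and decode them once at the end: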
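    public static String decompress(byte[] compressed) throws IOException {
        final int BUFFER_SIZE = 512;
        ByteArrayInputStream is = new ByteArrayInputStream(compressed);
        GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
        byte[] data = new byte[BUFFER_SIZE];
        int bytesRead;
        // Accumulate raw bytes; no decoding happens inside the loop.
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        while ((bytesRead = gis.read(data)) != -1) {
            baos.write(data, 0, bytesRead);
        }
        gis.close();
        is.close();
        // Decode the complete byte sequence in one step.
        return baos.toString("UTF-8");
    }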
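To see why decoding chunk by chunk fails, here is a minimal sketch that splits the two UTF-8 bytes of a single Cyrillic letter and decodes each half separately. The class name `SplitDemo` and the helper `decodeInChunks` are made up for this illustration and do not come from the question.

```java
import java.nio.charset.StandardCharsets;

public class SplitDemo {
    // Hypothetical helper: decodes two halves of a byte array separately,
    // the way the chunked loop in the question decodes each buffer on its own.
    static String decodeInChunks(byte[] bytes, int splitAt) {
        String first = new String(bytes, 0, splitAt, StandardCharsets.UTF_8);
        String second = new String(bytes, splitAt, bytes.length - splitAt, StandardCharsets.UTF_8);
        return first + second;
    }

    public static void main(String[] args) {
        // U+0438 (Cyrillic small letter i) is encoded as two bytes in UTF-8.
        byte[] bytes = "\u0438".getBytes(StandardCharsets.UTF_8);

        String whole = new String(bytes, StandardCharsets.UTF_8);
        String broken = decodeInChunks(bytes, 1); // split in the middle of the character

        System.out.println(whole.equals("\u0438"));    // decoding all bytes at once is intact
        System.out.println(broken.contains("\uFFFD")); // split decoding yields U+FFFD
    }
}
```

Each incomplete half is malformed UTF-8 on its own, so the `String` constructor substitutes the replacement character U+FFFD, which is what shows up as question marks in the output.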
Answer 1 (score: 2)
You can wrap the GZIPInputStream in a Reader (for example, an InputStreamReader with an explicit charset) and read characters instead of bytes. That way you do not run into the problem of a buffer boundary producing invalid encoding.
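A minimal sketch of this approach, assuming the compressed payload is UTF-8 text (the question does not state the encoding outright). The class name `GzipText` and the `compress` helper are hypothetical and exist only to make the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipText {
    private static final int BUFFER_SIZE = 512;

    // Decompresses GZIP data and decodes it as UTF-8 (assumed encoding).
    public static String decompress(byte[] compressed) throws IOException {
        StringBuilder out = new StringBuilder();
        // The reader keeps incomplete byte sequences until the rest arrives,
        // so a character is never decoded from half of its bytes.
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed), BUFFER_SIZE),
                StandardCharsets.UTF_8)) {
            char[] chars = new char[BUFFER_SIZE];
            int charsRead;
            while ((charsRead = reader.read(chars)) != -1) {
                out.append(chars, 0, charsRead);
            }
        }
        return out.toString();
    }

    // Helper for the usage example only: compresses a string as UTF-8.
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Mixed one-byte and two-byte characters, long enough to span several buffers.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("x\u0438");
        }
        String text = sb.toString();
        System.out.println(decompress(compress(text)).equals(text));
    }
}
```

The try-with-resources block closes the reader and, through it, the underlying streams, so the explicit `close()` calls from the question are not needed here.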