Question

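As far as I know, the default encoding for HTTP requests is ISO 8859-1.

Can I use Unicode to decode an HTTP request that is given to me as a byte array?

If not, how would I decode such a request in C#?

Edit: I am developing a server, not a client.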

Answer 1

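As you said, the default encoding of an HTTP POST request is ISO-8859-1. If the request uses a different encoding, you have to look at the Content-Type header, which might look like Content-Type: application/x-www-form-urlencoded; charset=UTF-8.

Once you have read the posted data into a byte array, you may decide to convert this buffer to a string (remember that all strings in .NET are UTF-16). Only at that moment do you need to know the encoding.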

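byte[] buffer = ReadFromRequestStream(...);
// Placeholder name: pass the charset taken from the Content-Type header,
// or "ISO-8859-1" if the header does not specify one.
string data = Encoding
              .GetEncoding("DETECTED ENCODING OR ISO-8859-1")
              .GetString(buffer);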
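The "detected encoding" step can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class and method names (RequestBodyDecoder, ResolveEncoding, Decode) are hypothetical, and it assumes the raw Content-Type header value is already available as a string. It uses System.Net.Mime.ContentType to parse the charset parameter and falls back to ISO-8859-1 when the header is missing, malformed, or names an unknown charset.

```csharp
using System;
using System.Net.Mime;
using System.Text;

static class RequestBodyDecoder
{
    // Hypothetical helper: returns the encoding named by the charset parameter
    // of the Content-Type header, or ISO-8859-1 when none can be determined.
    public static Encoding ResolveEncoding(string contentTypeHeader)
    {
        Encoding fallback = Encoding.GetEncoding("ISO-8859-1");
        if (string.IsNullOrWhiteSpace(contentTypeHeader))
            return fallback;

        try
        {
            string charset = new ContentType(contentTypeHeader).CharSet;
            return string.IsNullOrEmpty(charset)
                ? fallback
                : Encoding.GetEncoding(charset);
        }
        catch (FormatException)   { return fallback; } // header could not be parsed
        catch (ArgumentException) { return fallback; } // charset name not recognized
    }

    // Decodes the request body bytes using the resolved encoding.
    public static string Decode(byte[] body, string contentTypeHeader)
    {
        return ResolveEncoding(contentTypeHeader).GetString(body);
    }
}
```

For example, RequestBodyDecoder.Decode(buffer, "application/x-www-form-urlencoded; charset=UTF-8") decodes the buffer as UTF-8, while passing a header without a charset parameter decodes it as ISO-8859-1.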

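To answer your question:

Can I use Unicode to decode an HTTP request that is given as a byte array?

Yes, if the byte array was encoded with a Unicode encoding such as UTF-8: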

string data = Encoding.UTF8.GetString(buffer);

Answer 2

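You don't use a Unicode encoding to decode something that was not encoded with a Unicode encoding, because it would not decode all characters correctly.

Create an Encoding object for the correct encoding and use that: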

Encoding iso = Encoding.GetEncoding("iso-8859-1");
string request = iso.GetString(requestArray);
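To make the mismatch concrete, here is a small sketch (the class and method names are hypothetical) that decodes the same bytes once as ISO-8859-1 and once as UTF-8 so the two results can be compared.

```csharp
using System;
using System.Text;

static class EncodingMismatchDemo
{
    // Decodes the same byte array twice: as ISO-8859-1 and as UTF-8.
    public static (string Latin1, string Utf8) DecodeBoth(byte[] bytes)
    {
        string latin1 = Encoding.GetEncoding("iso-8859-1").GetString(bytes);
        string utf8 = Encoding.UTF8.GetString(bytes);
        return (latin1, utf8);
    }
}
```

With the bytes 63 61 66 E9 ("café" in ISO-8859-1), the Latin1 result is "café", while the Utf8 result has the replacement character U+FFFD where the é should be, because a lone E9 byte is not a valid UTF-8 sequence.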

Answer 3

The code below should help. If you expect a lot of data to come down the stream, doing this asynchronously is the best approach.

using System;
using System.IO;
using System.Net;
using System.Text;

string myUrl = @"http://somedomain.com/file";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(myUrl);

// Set some reasonable limits on resources used by this request
request.MaximumAutomaticRedirections = 4;
request.MaximumResponseHeadersLength = 4; // in kilobytes
request.Timeout = 15000;                  // in milliseconds

// Dispose the response, stream, and reader when done so the connection is released
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream receiveStream = response.GetResponseStream())
using (StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8))
{
    char[] read = new char[512];

    // Read up to 512 characters at a time
    int count = readStream.Read(read, 0, read.Length);

    while (count > 0)
    {
        // Copy the characters just read into a string and display it
        string str = new string(read, 0, count);
        Console.Write(str);
        count = readStream.Read(read, 0, read.Length);
    }
}

Answer 4

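Every time .NET transfers information between an external representation (such as a TCP socket) and its internal Unicode format, in either direction, some form of encoding is involved.

See utf-8-vs-unicode, especially Jon Skeet's answer, and also Joel's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).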
