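Here is what I have so far (it doesn't work). At this point I think my target is an ANSI encoding, but I would really rather not have to know that in advance. My browser seems able to work out which encoding to use; how can I do the same?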
    static void GetUrl(Uri uri, string localFileName)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        HttpWebResponse response;
        response = (HttpWebResponse)request.GetResponse();
        // Save the stream to file
        Stream responseStream = response.GetResponseStream();
        StreamReader reader = new StreamReader(responseStream, Encoding.Default);
        Stream fileStream = File.OpenWrite(localFileName);
        using (StreamWriter sw = new StreamWriter(fileStream, Encoding.Default))
        {
            sw.Write(reader.ReadToEnd());
            sw.Flush();
            sw.Close();
        }
    }
After the answers (so far tested only on UTF-8 sites):
    static void GetUrl(Uri uri, string localFileName)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        try
        {
            // Hope GetEncoding() knows how to parse the CharacterSet
            Encoding encoding = Encoding.GetEncoding(response.CharacterSet);
            using (StreamReader reader = new StreamReader(response.GetResponseStream(), encoding))
            using (StreamWriter sw = new StreamWriter(localFileName, false, encoding))
            {
                // Disposing the writer flushes and closes it, so no explicit Flush()/Close() is needed
                sw.Write(reader.ReadToEnd());
            }
        }
        finally
        {
            response.Close();
        }
    }
Answer 0 (score: 3)
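A web browser tries to detect the character encoding in three ways.
It looks for a meta tag (in HTML):
    <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
or an XML declaration (for XHTML):
    <?xml version="1.0" encoding="ISO-8859-1"?>
or sometimes the encoding is even specified in the HTTP header:
    Content-Type: text/html; charset=ISO-8859-1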
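
The three lookups above could be sketched roughly as follows. This is only an illustration: `CharsetSniffer` and `FindCharsetName` are hypothetical names, not part of .NET, and `documentHead` is assumed to be the first kilobyte or so of the response already decoded as ASCII-compatible text. Checking the HTTP header first is also an assumption here; the list above does not state a priority.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a .NET API): looks for a declared charset name
// in the HTTP header, the XML declaration, and the HTML meta tag.
public static class CharsetSniffer
{
    // charset=... inside a Content-Type header value
    private static readonly Regex HeaderCharset =
        new Regex(@"charset\s*=\s*[""']?([A-Za-z0-9_\-:.]+)", RegexOptions.IgnoreCase);

    // encoding="..." inside an <?xml ... ?> declaration
    private static readonly Regex XmlEncoding =
        new Regex(@"<\?xml[^>]*\bencoding\s*=\s*[""']([A-Za-z0-9_\-:.]+)[""']", RegexOptions.IgnoreCase);

    // charset=... anywhere inside a <meta ...> tag
    private static readonly Regex MetaCharset =
        new Regex(@"<meta[^>]*charset\s*=\s*[""']?([A-Za-z0-9_\-:.]+)", RegexOptions.IgnoreCase);

    // Returns the declared charset name, or null when none is declared.
    public static string FindCharsetName(string contentTypeHeader, string documentHead)
    {
        Match m = HeaderCharset.Match(contentTypeHeader ?? "");
        if (m.Success) return m.Groups[1].Value;

        m = XmlEncoding.Match(documentHead ?? "");
        if (m.Success) return m.Groups[1].Value;

        m = MetaCharset.Match(documentHead ?? "");
        if (m.Success) return m.Groups[1].Value;

        return null; // nothing declared; the caller has to pick a fallback
    }
}
```

For example, `CharsetSniffer.FindCharsetName("text/html; charset=ISO-8859-1", "")` yields the name from the header, and the result can then be handed to `Encoding.GetEncoding`. Regular expressions are a shortcut here; a real HTML parser would be more robust against unusual markup.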
Answer 1 (score: 2)
You should look up the encoding in which the server sent the response. Encoding.Default
won't cut the mustard here. :-)
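    Stream responseStream = response.GetResponseStream();
    // Use the charset the server declared in its Content-Type header
    Encoding enc = Encoding.GetEncoding(response.CharacterSet);
    StreamReader reader = new StreamReader(responseStream, enc);
    // File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes
    Stream fileStream = File.Create(localFileName);
    using (StreamWriter sw = new StreamWriter(fileStream, enc))
    { /* ... */ }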
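
One caveat with the snippet above: `response.CharacterSet` can be empty when the server declares no charset, and `Encoding.GetEncoding` throws an `ArgumentException` for names it does not recognize. A minimal sketch of a guard, assuming a caller-chosen fallback is acceptable (`EncodingResolver` and `Resolve` are hypothetical names, not part of .NET):

```csharp
using System;
using System.Text;

// Hypothetical helper (not a .NET API): maps a charset name to an Encoding,
// returning the supplied fallback when the name is missing or unknown.
public static class EncodingResolver
{
    public static Encoding Resolve(string charsetName, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charsetName))
            return fallback; // server declared no charset

        try
        {
            // Some servers wrap the value in quotes, so strip them first
            return Encoding.GetEncoding(charsetName.Trim().Trim('"', '\''));
        }
        catch (ArgumentException)
        {
            return fallback; // name not recognized on this platform
        }
    }
}
```

It would be used as `Encoding enc = EncodingResolver.Resolve(response.CharacterSet, Encoding.UTF8);`. UTF-8 as the fallback is only an assumption; a different default may suit the sites you download from.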
To be safe, you can convert everything to UTF-8 and always store the file as UTF-8. That way you never need to guess the encoding when reading the file back.
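
A minimal sketch of that conversion, assuming the source encoding has already been determined: decode with the source encoding, then write with UTF-8. `Utf8Saver` and `CopyAsUtf8` are hypothetical names, and writing without a byte order mark is a choice made here, not something the answer specifies.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not a .NET API): re-encodes a text stream as UTF-8.
public static class Utf8Saver
{
    // Decodes 'source' using 'sourceEncoding' and writes the text to
    // 'destination' as UTF-8 without a byte order mark.
    // Both streams are closed when the method returns.
    public static void CopyAsUtf8(Stream source, Encoding sourceEncoding, Stream destination)
    {
        using (var reader = new StreamReader(source, sourceEncoding))
        using (var writer = new StreamWriter(destination, new UTF8Encoding(false)))
        {
            char[] buffer = new char[4096];
            int read;
            // Copy in chunks so large pages are not held in memory at once
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }
    }
}
```

In the download code this might be called as `Utf8Saver.CopyAsUtf8(response.GetResponseStream(), enc, File.Create(localFileName));`, after which the saved file can always be read back with `Encoding.UTF8`.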