When I return a string from an HttpWebRequest, the response contains character codes (&#39; and &quot;) that break my output (it shows 39; and quot;):
internal static void TranslateThis(Player player, string fromLang, string toLang, string text)
{
    try
    {
        string translated = null;
        HttpWebRequest hwr = (HttpWebRequest)HttpWebRequest.Create("http://translate.google.com/?langpair=" + fromLang + "|" + toLang + "&text=" + text.Replace(" ", "+") + "#");
        HttpWebResponse res = (HttpWebResponse)hwr.GetResponse();
        StreamReader sr = new StreamReader(res.GetResponseStream());
        string html = sr.ReadToEnd();
        int a = html.IndexOf("onmouseout=\"this.style.backgroundColor='#fff'\">") + 47;
        int b = html.IndexOf("</span>", html.IndexOf("onmouseout=\"this.style.backgroundColor='#fff'\">") + 47);
        translated = html.Substring(a, b - a);
        if (translated.Length < (10 * text.Length))
        {
            if (player == Player.Console)
            {
                player.ParseMessage(translated, true);
            }
            else
            {
                player.ParseMessage(translated, false);
            }
        }
        else
        {
            player.Message("Usage: /translate [lang] [message]");
        }
    }
    catch
    {
        player.Message("Usage: /translate [lang] [message]");
    }
}
Answer 0 (score: 1)
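First, make sure you read the downloaded content with the correct encoding. See this SO answer for code that shows how to do that.
Basically, check the HTTP headers and the meta tags for the encoding, and re-encode the content if necessary. Then call HttpUtility.HtmlDecode to get rid of any HTML-encoded characters. At that point you are ready to search for whatever you are looking for.
I would also suggest using something like the Html Agility Pack to parse the HTML instead of IndexOf.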
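The header check described above can be sketched as follows. This is only a minimal illustration: the class and method names (ResponseText, EncodingFromContentType, DecodeBody) are hypothetical, the UTF-8 fallback is an assumption, and the meta-tag check is left out. With an HttpWebResponse, the CharacterSet property exposes the same charset value.

```csharp
using System;
using System.Text;

static class ResponseText
{
    // Hypothetical helper: picks an Encoding from a Content-Type header value
    // such as "text/html; charset=ISO-8859-1".
    // Assumption: fall back to UTF-8 when no usable charset is present.
    public static Encoding EncodingFromContentType(string contentType)
    {
        const string key = "charset=";
        if (!string.IsNullOrEmpty(contentType))
        {
            int i = contentType.IndexOf(key, StringComparison.OrdinalIgnoreCase);
            if (i >= 0)
            {
                // Take everything after "charset=" up to the next ';' and strip quotes.
                string name = contentType.Substring(i + key.Length).Split(';')[0].Trim(' ', '"');
                try
                {
                    return Encoding.GetEncoding(name);
                }
                catch (ArgumentException)
                {
                    // Unknown or empty charset name: use the fallback below.
                }
            }
        }
        return Encoding.UTF8;
    }

    // Decodes the raw response bytes with the encoding named in the header.
    public static string DecodeBody(byte[] body, string contentType)
    {
        return EncodingFromContentType(contentType).GetString(body);
    }
}
```

The raw bytes would come from the response stream; decoding HTML entities is a separate step that runs on the resulting string.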
Answer 1 (score: 1)
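It is hard to say exactly what your ParseMessage method expects, so this is only a guess:
The result you get from Google Translate is HTML. That means that if you want plain-text output, you have to convert the HTML to text. You have already managed to extract the translation from the HTML page (at least for now, until Google Translate changes its output page a little; your solution is not exactly foolproof or future-proof). But the translation is still HTML-encoded, and you need to decode it. For that you can use the WebUtility.HtmlDecode method (assuming you are on .NET Framework 4). After the line
translated = html.Substring(a, b - a);
add
translated = WebUtility.HtmlDecode(translated);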
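To see what that call does to the entities from the question, here is a minimal sketch. The wrapper class EntityDemo and the sample string are made up for illustration; only WebUtility.HtmlDecode is the real framework method.

```csharp
using System;
using System.Net;

static class EntityDemo
{
    // Thin wrapper so the call can be tried in isolation.
    public static string Decode(string fragment)
    {
        // Turns numeric and named entities (&#39;, &quot;, &amp;, ...) into characters.
        return WebUtility.HtmlDecode(fragment);
    }
}

// Usage (the input is a made-up sample, not real Google Translate output):
//   EntityDemo.Decode("I&#39;m &quot;here&quot;")  returns  I'm "here"
```

Because it handles all entities at once, this avoids replacing them one by one with string.Replace.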
Answer 2 (score: 1)
A discussion with another developer led me to try this before the last batch of comments came in. Here is what finally worked:
// Requires: using System; using System.IO; using System.Net; using System.Text; using System.Text.RegularExpressions;
internal static void TranslateThis(Player player, string fromLang, string toLang, string text)
{
    try
    {
        string translated = null;
        // Drop everything except word characters, '.', apostrophes, whitespace, '@' and '-'.
        text = Regex.Replace(text, @"[^\w\.\'\s@-]", "");
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://translate.google.com/?langpair=" + fromLang + "|" + toLang + "&text=" + text.Replace(" ", "+") + "#");
        request.MaximumAutomaticRedirections = 4;
        request.MaximumResponseHeadersLength = 4;
        request.Credentials = CredentialCache.DefaultCredentials;
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        Stream receiveStream = response.GetResponseStream();
        StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF7);
        String html = readStream.ReadToEnd() + "";
        // 47 is the length of the marker string that precedes the translated text.
        int a = html.IndexOf("onmouseout=\"this.style.backgroundColor='#fff'\">") + 47;
        int b = html.IndexOf("</span>", html.IndexOf("onmouseout=\"this.style.backgroundColor='#fff'\">") + 47);
        translated = html.Substring(a, b - a);
        response.Close();
        readStream.Close();
        if (translated.Length < (10 * text.Length))
        {
            // Turn the apostrophe entity back into a real apostrophe,
            // then strip any other characters outside the allowed set.
            translated = translated.Replace("&#39;", "'");
            translated = Regex.Replace(translated, @"[^\w\.\'\s@-]", "");
            if (player == Player.Console)
            {
                player.ParseMessage(translated, true);
            }
            else
            {
                player.ParseMessage(translated, false);
            }
        }
        else
        {
            player.Message("Usage: /translate [lang] [message]");
        }
    }
    catch (Exception ex)
    {
        player.Message("Error:" + ex.ToString());
    }
}
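One fragile spot in the code above is the hard-coded offset 47: if the marker is not in the page, IndexOf returns -1 and the offset silently becomes 46, so Substring cuts at the wrong place or throws. A possible guard is sketched below; the helper name Extract.Between is hypothetical and returning null for "not found" is an assumption about how the caller wants to handle that case.

```csharp
using System;

static class Extract
{
    // Returns the text between the first startMarker and the next endMarker,
    // or null when either marker is missing.
    public static string Between(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null; // page layout changed or marker absent
        }
        start += startMarker.Length; // replaces the magic number 47
        int end = html.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0)
        {
            return null;
        }
        return html.Substring(start, end - start);
    }
}
```

In the method above it could be called as Extract.Between(html, "onmouseout=\"this.style.backgroundColor='#fff'\">", "</span>"), with a null result treated like the existing usage-message branch.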