How do I convert a string containing gibberish / black diamonds to Hebrew?

Time: 2015-10-06 11:11:45

Tags: c# .net winforms

This is the code I'm trying, but it doesn't work with either code page 1255 or 862:

Encoding latinEncoding = Encoding.GetEncoding("Windows-1252");
Encoding hebrewEncoding = Encoding.GetEncoding(862);//"Windows-1255");
string name = anchor.InnerText;
byte[] latinBytes = latinEncoding.GetBytes(name);

string hebrewString = hebrewEncoding.GetString(latinBytes);

Maybe the problem is that the source text isn't Latin to begin with. What I see in the name variable is ����� ����� instead of Hebrew letters.
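The black diamonds are the Unicode replacement character U+FFFD, which a decoder emits when it meets bytes it cannot interpret. The sketch below is only an illustration under an assumption the question does not confirm: that the page is encoded as Windows-1255 and was read as UTF-8 (the default for File.ReadAllLines when no encoding is passed). The class and method names are hypothetical. It suggests why re-encoding the mangled string cannot bring the Hebrew back: the original bytes are already gone by the time the string exists.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Decode bytes as UTF-8, which is what File.ReadAllLines does
    // when no encoding argument is passed and no BOM is present.
    public static string DecodeAsUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    // The approach from the question: re-encode the string as
    // Windows-1252, then decode those bytes with a Hebrew code page.
    public static string TryRepair(string mangled, Encoding hebrew)
    {
        byte[] latinBytes = Encoding.GetEncoding(1252).GetBytes(mangled);
        return hebrew.GetString(latinBytes);
    }

    public static void Main()
    {
        // Needed on .NET Core / .NET 5+; on .NET Framework code pages
        // 1252 and 1255 are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Assumption: the page uses Windows-1255.
        Encoding hebrew = Encoding.GetEncoding(1255);

        const string word = "שלום";
        byte[] raw = hebrew.GetBytes(word);   // bytes as a server might send them

        string mangled = DecodeAsUtf8(raw);   // wrong decoder applied

        Console.WriteLine(mangled.Contains("\uFFFD"));          // True
        Console.WriteLine(TryRepair(mangled, hebrew) == word);  // False: bytes were lost
        Console.WriteLine(hebrew.GetString(raw) == word);       // True: decode raw bytes directly
    }
}
```

If this assumption holds, the fix belongs at the point where bytes first become a string, not inside parseIds.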

This is the full method I'm using:

private void parseIds(string html)
{
    var htmlDoc = new HtmlAgilityPack.HtmlDocument();
    htmlDoc.LoadHtml(html);

    var anchor = htmlDoc.DocumentNode.Descendants("a").FirstOrDefault();

    if (anchor != null)
    {
        Encoding latinEncoding = Encoding.GetEncoding("Windows-1252");
        Encoding hebrewEncoding = Encoding.GetEncoding(862);//"Windows-1255");
        string name = anchor.InnerText;
        byte[] latinBytes = latinEncoding.GetBytes(name);

        string hebrewString = hebrewEncoding.GetString(latinBytes);

        string href = anchor.Attributes["href"].Value;

        Uri uri;

        if (Uri.TryCreate(href, UriKind.RelativeOrAbsolute, out uri))
        {
            if (!uri.IsAbsoluteUri)
                uri = new Uri(new Uri("http://www.google.com/"), uri);

            var queryKeyValues = System.Web.HttpUtility.ParseQueryString(uri.Query);
            string forumId = queryKeyValues["forumId"];
        }
    }
}

This is how I call it in the constructor:

WebClient webclient = new WebClient();
webclient.DownloadFile("http://www.tapuz.co.il/forums/forumslistnew.asp", @"c:\testhtml\mainforums.html");
webclient.Dispose();

string[] lines = File.ReadAllLines(@"c:\testhtml\mainforums.html");

foreach(string line in lines)
{
    if (line.Contains("href") && line.Contains("forumId=") && !wholeids.Contains(line))
    {
        parseIds(line);                    
    }
}

Where should I apply the Hebrew encoding?

I tried using:

webclient.Encoding = System.Text.Encoding.UTF8;

once before webclient.DownloadFile and once after that line, but it didn't change anything.
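Setting webclient.Encoding has no visible effect here because that property is used by the string methods (such as DownloadString), while DownloadFile writes the response bytes to disk unchanged. The decoding actually happens later, in File.ReadAllLines. A minimal sketch of one possible fix follows. It assumes the page is Windows-1255, which should be verified against the response's Content-Type header or the page's meta charset. HebrewDownloadSketch, DecodeLines and DownloadLines are hypothetical names introduced for illustration.

```csharp
using System;
using System.Net;
using System.Text;

public static class HebrewDownloadSketch
{
    // Decode raw bytes with an explicitly chosen encoding, then split into lines.
    public static string[] DecodeLines(byte[] raw, Encoding encoding)
    {
        string text = encoding.GetString(raw);
        return text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
    }

    // Download the page as raw bytes so that no decoding happens implicitly.
    public static string[] DownloadLines(string url, Encoding encoding)
    {
        using (var webclient = new WebClient())
        {
            byte[] raw = webclient.DownloadData(url);
            return DecodeLines(raw, encoding);
        }
    }

    public static void Main()
    {
        // Needed on .NET Core / .NET 5+; on .NET Framework code page
        // 1255 is built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Assumption: the site serves Windows-1255.
        Encoding hebrew = Encoding.GetEncoding(1255);

        // In-memory stand-in for a downloaded page, so the demo needs no network.
        byte[] sample = hebrew.GetBytes("<a href=\"x?forumId=1\">שלום</a>\r\nsecond line");

        string[] lines = DecodeLines(sample, hebrew);
        Console.WriteLine(lines.Length);                  // 2
        Console.WriteLine(lines[0].Contains("שלום"));     // True
    }
}
```

In the constructor this could replace the DownloadFile / ReadAllLines pair with DownloadLines("http://www.tapuz.co.il/forums/forumslistnew.asp", Encoding.GetEncoding(1255)). If the file on disk is still wanted, passing the encoding as the second argument to File.ReadAllLines should have the same effect. Either way, parseIds could then use anchor.InnerText directly, without the Windows-1252 round trip.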

0 answers:

No answers