C#: converting a connection string from win1251 to UTF-8 and back

Date: 2017-03-19 07:26:21

Tags: c# encoding utf-8 connection-string firebird

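I am trying to work around a problem with connection string encoding in the Firebird .NET provider, version >= 5.6.0.0 (currently 5.8.0.0). A full description of the problem is posted separately for anyone interested, but I think I can explain it briefly. My system default encoding is win1251, and my connection string contains a parameter called "DbPath" whose value is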

   "F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB" 

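When I pass this connection string to the Firebird .NET provider, it takes the "DbPath" parameter from the connection string and gets the bytes of its value using Encoding.UTF8. This is what that looks like in the provider's code: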

protected virtual void SendAttachToBuffer(DatabaseParameterBuffer dpb, string database)
{
    XdrStream.Write(IscCodes.op_attach);
    XdrStream.Write(0);
    if (!string.IsNullOrEmpty(Password))
    {
      dpb.Append(IscCodes.isc_dpb_password, Password);
    }

    //database is DbPath
    XdrStream.WriteBuffer(Encoding.UTF8.GetBytes(database)); 

    XdrStream.WriteBuffer(dpb.ToArray());
}

As you can see, they do not convert the encoding from win1251 to UTF-8; they simply get the bytes with Encoding.UTF8.GetBytes().

Later in their code I see that they build a string using the current encoding (Encoding.Default):

public string GetString(byte[] buffer, int index, int count)
{
  //_encoding is Encoding.Default == win1251
  return _encoding.GetString(buffer, index, count);
}

As a result of this line I get an I/O exception, because my DbPath turns into

"F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB" 

So the first thing I tried was converting my connection string to UTF-8 with this code:

 private static string Win1251ToUTF8(string source)
 {
   Encoding utf8 = Encoding.GetEncoding("utf-8");
   Encoding win1251 = Encoding.GetEncoding("windows-1251");
   byte[] win1251Bytes = win1251.GetBytes(source);
   byte[] utf8bytes = Encoding.Convert(win1251, utf8, win1251Bytes);
   source = utf8.GetString(utf8bytes);
   return source;
   //Actually I'm not sure that I'm converting Encoding correctly

 }

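But it had no effect. I have tried many variations with Encoding.Convert and still have no solution. Can anyone tell me what I am doing wrong and how to fix it? Regards.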
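The mismatch described in this question can be reproduced without Firebird. The following is a minimal sketch, not the provider's code: it encodes a path with Encoding.UTF8 and then decodes the same bytes with windows-1251, which is the combination shown in the two provider snippets above. The names MojibakeDemo and EncodeUtf8DecodeWith and the shortened path are made up for illustration. The RegisterProvider call is an assumption about the runtime: it is needed on .NET Core / .NET 5+, while on .NET Framework code page 1251 is available out of the box.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: encodes text as UTF-8, then decodes the bytes
    // with whatever encoding the caller supplies.
    public static string EncodeUtf8DecodeWith(string text, Encoding decoder)
    {
        // A .NET string is UTF-16 internally, so this yields valid UTF-8 bytes
        // regardless of the system default code page.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return decoder.GetString(utf8Bytes);
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+; not required on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Shortened, illustrative path.
        string path = "F:\\Рабочая\\БД\\MYDB.IB";
        Encoding win1251 = Encoding.GetEncoding("windows-1251");

        // Decoding UTF-8 bytes as windows-1251 garbles the Cyrillic part.
        Console.WriteLine(EncodeUtf8DecodeWith(path, win1251) == path);       // False
        // Decoding with the encoding that produced the bytes restores the path.
        Console.WriteLine(EncodeUtf8DecodeWith(path, Encoding.UTF8) == path); // True
    }
}
```

The bytes written by the provider are therefore already valid UTF-8; the damage happens only when they are read back with a different encoding.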

2 answers:

Answer 0 (score: 2)

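Thanks for the downvote!

I answered in good faith; perhaps you misunderstand how to work with these tools...

I suggest you try the following code; maybe it will help. Create a new C# Windows Forms application, place a BIG multiline TextBox "textBox1" and a button "button1" on the form, and put the following code in the button's click handler: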

    // ----- The work -------------------------------------------------
    string source = "F:\\\\Рабочая\\\\БД\\\\2.14.1\\\\January_2017\\\\MYDB.IB";
    Encoding utf8 = Encoding.UTF8;
    Encoding unicode = Encoding.Unicode;
    Encoding win1251 = Encoding.GetEncoding("windows-1251");
    byte[] utf8Bytes = utf8.GetBytes(source);
    byte[] win1251Bytes = win1251.GetBytes(source);
    byte[] utf8ofwinBytes = Encoding.Convert(win1251, utf8, win1251Bytes);
    string unicodefromutf8 = utf8.GetString(utf8Bytes);
    string unicodefromwin1251 = win1251.GetString(win1251Bytes);




    // ----- The show -------------------------------------------------

    textBox1.Text = "";

    textBox1.Text += "Literal Unicode source" + Environment.NewLine;
    textBox1.Text += source + Environment.NewLine + Environment.NewLine;

    string s1 = "";
    textBox1.Text += "UTF8" + Environment.NewLine;
    for (int i = 0; i < utf8Bytes.Length; i++)
    {
        s1 += utf8Bytes[i].ToString() + ", ";
    }
    textBox1.Text += s1 + Environment.NewLine + Environment.NewLine;

    s1 = "";
    textBox1.Text += "WIN 1251" + Environment.NewLine;
    for (int i = 0; i < win1251Bytes.Length; i++)
    {
        s1 += win1251Bytes[i].ToString() + ", ";
    }
    textBox1.Text += s1 + Environment.NewLine + Environment.NewLine;

    s1 = "";
    textBox1.Text += "UTF8 of WIN 1251" + Environment.NewLine;
    for (int i = 0; i < utf8ofwinBytes.Length; i++)
    {
        s1 += utf8ofwinBytes[i].ToString() + ", ";
    }
    textBox1.Text += s1 + Environment.NewLine + Environment.NewLine;


    textBox1.Text += "Unicode string of UTF8 bytes" + Environment.NewLine;
    textBox1.Text += unicodefromutf8 + Environment.NewLine + Environment.NewLine;

    textBox1.Text += "Unicode string of WIN 1251 bytes" + Environment.NewLine;
    textBox1.Text += unicodefromwin1251 + Environment.NewLine + Environment.NewLine;

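Run it, click the button, and you will see that all conversions and encodings work as expected.

You asked for a way to convert Unicode to UTF-8 to WIN1251 to UTF-8 to Unicode. Here it is.

Your misunderstanding is probably here: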

source = utf8.GetString(utf8bytes);
return source;

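This converts the UTF-8 byte array you just created back into a Unicode string. So you return a Unicode string, not the UTF-8 byte sequence of the win-1251 string. To be precise, you return exactly the same string you were given.

You have to push the (properly zero-terminated) UTF-8 byte sequence to the .NET provider.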
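The round trip is easy to check. In the minimal sketch below, RoundTrip repeats the steps of Win1251ToUTF8 from the question, and ToUtf8Bytes returns what a byte-oriented API would actually consume. The class and method names are illustrative and are not part of the Firebird provider; whether the provider exposes a way to accept raw bytes is not something this sketch addresses. As before, the RegisterProvider call assumes .NET Core / .NET 5+ and can be dropped on .NET Framework.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Same steps as Win1251ToUTF8 in the question: string -> win1251 bytes
    // -> UTF-8 bytes -> string.
    public static string RoundTrip(string source)
    {
        Encoding win1251 = Encoding.GetEncoding("windows-1251");
        byte[] win1251Bytes = win1251.GetBytes(source);
        byte[] utf8Bytes = Encoding.Convert(win1251, Encoding.UTF8, win1251Bytes);
        // Decoding turns the bytes back into an ordinary .NET (UTF-16) string.
        return Encoding.UTF8.GetString(utf8Bytes);
    }

    // The UTF-8 representation only exists as bytes, never as a string.
    public static byte[] ToUtf8Bytes(string source)
    {
        return Encoding.UTF8.GetBytes(source);
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+; not required on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Shortened, illustrative path.
        string path = "F:\\Рабочая\\БД\\MYDB.IB";

        Console.WriteLine(RoundTrip(path) == path);                // True
        // Each Cyrillic letter takes two bytes in UTF-8, so there are more bytes than chars.
        Console.WriteLine(ToUtf8Bytes(path).Length > path.Length); // True
    }
}
```

For text that windows-1251 can represent, RoundTrip returns its input unchanged. For characters outside that code page it can only lose information, because the first GetBytes call substitutes a replacement character.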

Answer 1 (score: 0)

Use Encoding.Convert to convert between character sets:

Encoding utf8 = Encoding.UTF8;
Encoding win = Encoding.GetEncoding("windows-1251");
byte[] winBytes = win.GetBytes(source);
byte[] utfBytes = Encoding.Convert(win, utf8, winBytes);
string result = utf8.GetString(utfBytes);