Decode text from console

Asked: 2016-11-23 09:21:16

Tags: c# .net console decode

I am trying to run this code:

private void Test(object sender, RoutedEventArgs e)
    {
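        // The backslash in the path is escaped so that "\t" is not read as a tab.
        ProcessStartInfo start = new ProcessStartInfo("cmd",
    "/c \"wbadmin start recovery -version:02/26/2014-17:38 -itemtype:File -items:C:\\test\"");
        // The DataReceived events only fire when the streams are redirected.
        start.UseShellExecute = false;
        start.RedirectStandardOutput = true;
        start.RedirectStandardError = true;

        int exitCode;
        using (Process proc = Process.Start(start))
        {
            proc.ErrorDataReceived += cmd_Error;
            proc.OutputDataReceived += cmd_DataReceived;
            proc.BeginOutputReadLine();
            proc.BeginErrorReadLine();
            proc.WaitForExit();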

            exitCode = proc.ExitCode;
        }
    }

    private void cmd_DataReceived(object sender, DataReceivedEventArgs e)
    {
        if (e.Data == null) return;

        var source = Encoding.Unicode;
        var target = Encoding.UTF8;

        var sBytes = source.GetBytes(e.Data);
        var tBytes = Encoding.Convert(source, target, sBytes);

        var tString = Encoding.UTF8.GetString(tBytes);
        Console.WriteLine(tString);
    }

But I get this string: "wbadmin 1.0 - ®≠·,‡”¨•≠,™Æ†≠§≠©,‡Æ™®†‡¢†Ê®®". How can I decode this string?

2 Answers:

Answer 0: (score: 2)

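Parsing the output of cmd can be a bit tricky, because cmd has its own code page, which usually matches the OEM code page of the system's default locale (you can change it manually, for example with the chcp command).

Read this for details.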
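
To make the effect of a code page mismatch concrete, here is a minimal sketch. It is not the asker's exact situation: it uses UTF-8 and Latin-1 (code page 28591) as stand-ins, because both are available on every .NET runtime, whereas OEM code pages such as 866 need `CodePagesEncodingProvider` to be registered on .NET Core and later. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class CodePageMismatchDemo
{
    // Simulates the bug: bytes produced with one encoding are decoded with another.
    public static string DecodeWithWrongEncoding(string text, Encoding actual, Encoding assumed)
    {
        byte[] raw = actual.GetBytes(text);   // what the child process really wrote
        return assumed.GetString(raw);        // what the reader believes it wrote
    }

    // Undoes the mistake: recover the raw bytes, then decode them correctly.
    // This only works when the wrong decoding step lost no information.
    public static string Repair(string garbled, Encoding actual, Encoding assumed)
    {
        byte[] raw = assumed.GetBytes(garbled);
        return actual.GetString(raw);
    }
}
```

Calling `DecodeWithWrongEncoding("Привет", Encoding.UTF8, Encoding.GetEncoding(28591))` yields an unreadable string of the same kind as the one in the question, and `Repair` with the same two encodings restores the original text. In practice it is more reliable to decode correctly in the first place than to repair afterwards.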

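What worked for me when redirecting the output (tested, including with wbadmin) is the following:

  1. Get the OEM code page of the system's default locale: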

    [DllImport("kernel32.dll")]
    public static extern int GetSystemDefaultLCID();
    
    private static int GetCmdCodePage()
    {
        int lcid = GetSystemDefaultLCID();
        var ci = System.Globalization.CultureInfo.GetCultureInfo(lcid);
        return ci.TextInfo.OEMCodePage;
    }
    
  2. Get the corresponding encoding:

        Encoding enc = null;
        try
        {
            enc = Encoding.GetEncoding(GetCmdCodePage());
        }
        catch (Exception)
        {
            enc = Encoding.GetEncoding(855); // the value for Cyrillic
        }
    
  3. Set the encoding for the process:

        if (!File.Exists(Path.Combine(Environment.SystemDirectory, @"wbadmin.exe")))
        {
            Console.WriteLine("wbadmin.exe not found");
            return;
        }
        Process pr = new Process();
        ProcessStartInfo psi = new ProcessStartInfo(@"wbadmin.exe");
        psi.WindowStyle = ProcessWindowStyle.Hidden;
        psi.CreateNoWindow = true;
        psi.UseShellExecute = false;
        psi.Arguments = "/?"; // prints available commands
        psi.RedirectStandardOutput = true;
        psi.RedirectStandardError = true;
        psi.Verb = "runas"; // note: Verb is only honored when UseShellExecute is true
        psi.StandardOutputEncoding = enc;
        psi.StandardErrorEncoding = enc;
        pr.StartInfo = psi;
        pr.Start();
    
        pr.WaitForExit(1000);
        string error = pr.StandardError.ReadToEnd();
    
        if (!string.IsNullOrEmpty(error))
        {
            Console.WriteLine("error: " + error);
            pr.Close();
            pr.Dispose();
            return;
        }
    
        string output = pr.StandardOutput.ReadToEnd();
    
        pr.Close();
        pr.Dispose();
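
The three steps above can be folded into one small helper. The following is a sketch under assumptions, not the answer's original code: the names `RedirectedProcess`, `CreateStartInfo` and `Run` are hypothetical, and the encoding passed in is assumed to come from step 2. It differs from the listing above in one respect: both streams are read concurrently, so a full stdout pipe cannot stall the child while stderr is being read.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Threading.Tasks;

static class RedirectedProcess
{
    // Builds start info that redirects both streams and decodes them with `encoding`.
    public static ProcessStartInfo CreateStartInfo(string fileName, string arguments, Encoding encoding)
    {
        return new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,            // required for stream redirection
            CreateNoWindow = true,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            StandardOutputEncoding = encoding,  // e.g. the OEM code page from step 2
            StandardErrorEncoding = encoding
        };
    }

    // Runs the process to completion and returns (exit code, stdout, stderr).
    public static Tuple<int, string, string> Run(ProcessStartInfo psi)
    {
        using (Process process = Process.Start(psi))
        {
            // Start draining both pipes before waiting for the process to exit.
            Task<string> stdout = process.StandardOutput.ReadToEndAsync();
            Task<string> stderr = process.StandardError.ReadToEndAsync();
            process.WaitForExit();
            return Tuple.Create(process.ExitCode, stdout.Result, stderr.Result);
        }
    }
}
```

A call might look like `RedirectedProcess.Run(RedirectedProcess.CreateStartInfo("wbadmin.exe", "/?", enc))`, where `enc` is the encoding obtained in step 2.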
    

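Answer 1: (score: 0)

Your code looks perfectly valid, but it is pointless. The fact is that C# strings are always UTF-16, no matter what. Your cmd_DataReceived method converts the UTF-16 string into a byte array holding the UTF-8 representation of that same string, and then converts it straight back to UTF-16 by calling Encoding.UTF8.GetString(tBytes).

It looks like the external program writes to the console in some unknown encoding (UTF-8?), but by the time the data reaches cmd_DataReceived it has already been decoded into UTF-16.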
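
The point that the conversion changes nothing can be checked directly. The sketch below repeats the same three calls as the handler in the question inside a hypothetical helper (`RoundTripDemo.Utf16ToUtf8AndBack` is a name made up for illustration); for any well-formed string the result equals the input, garbled or not.

```csharp
using System;
using System.Text;

static class RoundTripDemo
{
    // Mirrors the conversion in the question's cmd_DataReceived handler.
    public static string Utf16ToUtf8AndBack(string data)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(data);                              // string -> UTF-16 bytes
        byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes); // UTF-16 bytes -> UTF-8 bytes
        return Encoding.UTF8.GetString(utf8Bytes);                                        // UTF-8 bytes -> string
    }
}
```

Because the text was already decoded with the wrong code page before the handler ran, re-encoding it afterwards cannot bring the original characters back; the fix has to happen where the process output is decoded.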

I think that if you really want to convert your string from UTF-8 to UTF-16, your code should be

private void cmd_DataReceived(object sender, DataReceivedEventArgs e)
    {
        if (e.Data == null) return;

        var source = Encoding.Unicode;

        var sBytes = source.GetBytes(e.Data);

        var tString = Encoding.UTF8.GetString(sBytes);
        Console.WriteLine(tString);
    }