Standalone executable's output is empty in C#

Time: 2019-04-15 16:52:21

Tags: c# python web-crawler

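When I run it from the command line, a standalone executable built from Python (pwc.exe) always prints the site's HTML to the console, for any website.

But when I try to read that output into a C# string, in most cases I get an empty string in C# (it only works for very small sites).

  1. In this case, everything works fine

  2. The console output is correct, but the C# string is empty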

pwc.exe code:

    from lxml import html
    import requests
    import sys

    url = sys.argv[1]
    host = sys.argv[2]
    headers = {
        'Host': host,
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0',
        'Accept': 'Accept: text/css,*/*;q=0.1',
        'Accept-Language': 'en-US,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate, br',
        'Connection': 'keep-alive'
    }
    r = requests.get(url, headers=headers)
    r.encoding = 'UTF-8'
    print(r.text)

C# code:

        var proc = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = AppDomain.CurrentDomain.BaseDirectory + @"pwc.exe",
                Arguments = "https://www.bbc.com/about-us www.bbc.com",
                UseShellExecute = false,
                RedirectStandardOutput = true,
                CreateNoWindow = true,
                WindowStyle = ProcessWindowStyle.Hidden
            }
        };

        proc.Start();
        string html = proc.StandardOutput.ReadToEnd();

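I need to get the pwc.exe console output (UTF-8) into a C# string. It looks like everything works in C# when I read the output for a very small page.

P.S. I also tried reading it like this, but it did not help:

        while (!proc.StandardOutput.EndOfStream)
        {
            html = proc.StandardOutput.ReadLine();
        }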

1 Answer:

Answer 0: (score: 0)

It is because of these exceptions (shown in a screenshot in the original answer).
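The source does not say which exceptions were raised. One plausible candidate, an assumption here and not something the answer confirms, is a `UnicodeEncodeError`: when stdout is redirected to a pipe on Windows, Python may fall back to a legacy code page such as cp1252 instead of UTF-8, and `print` then fails on characters that code page lacks. That would also fit the observation that only very small pages work. A minimal sketch of the effect, with `try_encode` and both sample pages made up for illustration:

```python
def try_encode(text, encoding):
    """Return "ok" if text can be encoded, else a short error description."""
    try:
        text.encode(encoding)
        return "ok"
    except UnicodeEncodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


# Hypothetical page contents: one ASCII-only, one with characters
# that the cp1252 code page cannot represent.
small_page = "<html><body>Hello</body></html>"
large_page = "<html><body>Zażółć gęślą jaźń \u2603</body></html>"

for name, page in (("small", small_page), ("large", large_page)):
    print(name, "cp1252:", try_encode(page, "cp1252"))
    print(name, "utf-8: ", try_encode(page, "utf-8"))
```

If this is the cause, the Python process dies before writing anything to stdout, so the C# side reads an empty string.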

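You can use the code below to trace errors in the output. You may need to do some conversion on the Python side so that the C# code receives the output correctly.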

    private static void ProcessItem()
    {
        var process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = AppDomain.CurrentDomain.BaseDirectory + @"dist\Webpy\webpy.exe",
                //Arguments = "https://gopro.com/about-us gopro.com",
                //Arguments = "https://www.google.com www.google.com",
                Arguments = "https://www.bbc.com/about-us www.bbc.com",
                UseShellExecute = false,
                RedirectStandardOutput = true,
                RedirectStandardError = true,
            }
        };
        //* Set your output and error (asynchronous) handlers
        process.OutputDataReceived += new DataReceivedEventHandler(OutputHandler);
        process.ErrorDataReceived += new DataReceivedEventHandler(OutputHandler);
        //* Start process and handlers
        process.Start();
        process.BeginOutputReadLine();
        process.BeginErrorReadLine();
        process.WaitForExit();
    }

    static void OutputHandler(object sendingProcess, DataReceivedEventArgs outLine)
    {
        //* Do your stuff with the output (write to console/log/StringBuilder)
        Console.WriteLine(outLine.Data);
    }
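As for the "conversion on the Python side" mentioned above, one possible approach (a sketch, assuming the failure is an encoding error on redirected stdout) is to skip the text layer and write UTF-8 bytes directly. The helper name `emit_utf8` is made up for illustration; in the question's script it would stand in for `print(r.text)`.

```python
import io
import sys


def emit_utf8(text, stream=None):
    """Write text as UTF-8 bytes, independent of the console or pipe encoding."""
    # sys.stdout.buffer is the underlying binary stream of standard output.
    target = stream if stream is not None else sys.stdout.buffer
    target.write(text.encode("utf-8"))
    target.flush()


# Demonstration against an in-memory buffer instead of the real stdout.
buf = io.BytesIO()
emit_utf8("<p>Zażółć \u2603</p>\n", stream=buf)
print(len(buf.getvalue()), "bytes written")
```

On the C# side, setting `StandardOutputEncoding = Encoding.UTF8` (and `StandardErrorEncoding` if stderr is redirected) on the `ProcessStartInfo` should then let the bytes decode as UTF-8 rather than in the default code page.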