Unable to run a Scrapy spider from C#

Date: 2017-11-02 13:34:08

Tags: c# python web-scraping scrapy scrapy-spider

I followed this link and was able to run a dummy Python file from my C# code:

    public JsonResult FetchscrapyDataUrl(String website)
    {
        ProcessStartInfo start = new ProcessStartInfo();
        start.FileName = @"C:\ProgramData\Anaconda3\python.exe";
        // Path to the .py file in the Scrapy project
        start.Arguments = @"C:\Users\PycharmProjects\scraping_web\scrape_info\main.py";

        start.CreateNoWindow = false;         // false: the console window is not suppressed
        start.UseShellExecute = false;        // Do not use the OS shell (required for redirection)
        start.RedirectStandardOutput = true;  // Capture whatever the script writes to standard output
        start.RedirectStandardError = true;   // Capture standard error (for example, Python exceptions)
        Console.WriteLine("Python Starting");

        using (Process process = Process.Start(start))
        {
            using (StreamReader reader = process.StandardOutput)
            {
                string stderr = process.StandardError.ReadToEnd(); // Errors and exceptions from the Python script
                string result = reader.ReadToEnd();                // Standard output (for example: print "test")
                Console.Write(result);
            }
        }
    }

I know that I can run a Scrapy spider from a single file, main.py:

from scrapy import cmdline    
cmdline.execute("scrapy crawl text".split())

When I run main.py from cmd on Windows it works fine, but when I run it from my C# code (.NET Framework) it does not work. The error is:

"Scrapy 1.4.0 - no active project\r\n\r\nUnknown command: crawl\r\n\r\nUse \"scrapy\" to see available commands\r\n"

Any ideas on how to get this running? Or am I missing some path setting in Windows?

Or should I run my spider from C# in some other way?

1 Answer:

Answer 0: (score: 1)

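You need to set the WorkingDirectory property. Scrapy locates the active project by looking for scrapy.cfg in the current working directory and its parent directories, so a process started from any other directory reports "no active project".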

start.WorkingDirectory = @"C:\Users\PycharmProjects\scraping_web\scrape_info\";

Alternatively, cd into that directory before launching the script.
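
A third option, not part of the original answer, is to make main.py independent of the directory it is launched from. The sketch below is a minimal illustration under that assumption: it walks upward from the script's own location until it finds scrapy.cfg, changes into that directory, and only then calls cmdline.execute. The helper names find_scrapy_project_root and run_spider are hypothetical and are not part of Scrapy.

```python
import os
from pathlib import Path


def find_scrapy_project_root(start):
    """Return the nearest directory at or above `start` containing scrapy.cfg, or None."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / "scrapy.cfg").is_file():
            return candidate
    return None


def run_spider(spider_name, script_path):
    """Change into the project root, then hand off to the Scrapy command line."""
    script_dir = Path(script_path).resolve().parent
    root = find_scrapy_project_root(script_dir)
    if root is None:
        raise RuntimeError("no scrapy.cfg found at or above " + str(script_dir))
    os.chdir(root)
    # Imported here so the helper above works even where Scrapy is not installed.
    from scrapy import cmdline
    # Same argument list as "scrapy crawl <spider_name>".split() in the question.
    cmdline.execute(["scrapy", "crawl", spider_name])
```

With this in place, the last line of main.py could be run_spider("text", __file__), and the C# caller would no longer depend on WorkingDirectory. Setting WorkingDirectory as described above remains the smaller change if the Python side should stay untouched.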