Question

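I am writing a program in C# using ScrapySharp or HtmlAgilityPack. My problem is that part of the information I need only appears after I click an HTML element (a button or a link).

In some forums it was suggested that Selenium lets you manipulate HTML elements, so I tried the following: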

    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;

    // Defines the interface with the Chrome browser
    IWebDriver driver = new ChromeDriver();
    // Helper variable to hold the anchor element (href)
    IWebElement element;
    // Go to the website
    driver.Url = url;

    // Click on the download button
    driver.FindElement(By.Id("Download button")).Click();

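However, because Selenium is a web automation testing tool, it opens a browser and loads the website in order to perform the click. That does not suit me, because I have to run these checks internally on several websites.

Although I could keep using Selenium, I am looking for a way to perform the click without a browser. Does anyone know how to click a link or button for web scraping without opening a browser?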

Answer 1

Hope this helps anyone with the same requirement.
If you want to avoid opening a browser window, you can use the following settings with ChromeDriver.

    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;

    // Settings to avoid opening a browser window
    var options = new ChromeOptions();
    options.AddArgument("--headless");
    // Also hide the chromedriver console window
    var service = ChromeDriverService.CreateDefaultService();
    service.HideCommandPromptWindow = true;

    // URL to access and scrape
    var url = "https://example.com";

    using (var driver = new ChromeDriver(service, options))
    {
        // Access the URL
        driver.Navigate().GoToUrl(url);

        // Click on the download button - copied from your code above
        driver.FindElement(By.Id("Download button")).Click();
    }
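
Since the question is about information that only shows up after the click, here is a rough sketch of how the headless driver could be combined with HtmlAgilityPack: click, wait for the new content, then parse the rendered HTML. The ids `download-button` and `result` are placeholders I made up, not taken from any real page, and `WebDriverWait` assumes the Selenium.Support NuGet package is installed.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI; // Selenium.Support NuGet package (assumed installed)

class HeadlessClickScrape
{
    static void Main()
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless");

        using (var driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl("https://example.com");

            // "download-button" is a hypothetical id; replace it with the real one
            driver.FindElement(By.Id("download-button")).Click();

            // Wait up to 10 seconds until at least one link appears inside the
            // hypothetical container with id "result"
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(By.CssSelector("#result a")).Count > 0);

            // Hand the rendered HTML over to HtmlAgilityPack for parsing
            var doc = new HtmlDocument();
            doc.LoadHtml(driver.PageSource);

            // SelectNodes returns null when nothing matches
            var links = doc.DocumentNode.SelectNodes("//*[@id='result']//a[@href]");
            if (links != null)
            {
                foreach (var link in links)
                {
                    Console.WriteLine(link.GetAttributeValue("href", ""));
                }
            }
        }
    }
}
```

The explicit wait matters because content loaded by the click may not be in `driver.PageSource` immediately after `Click()` returns.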

In addition to the above, you may also find the following links useful:

can-selenium-webdriver-open-browser-windows-silently-in-background

running-webdriver-without-opening-actual-browser-window
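
If the element is a plain link (an `<a>` with an `href`) and the target page does not depend on JavaScript, you may not need a browser at all: "clicking" then amounts to requesting the link's URL. A minimal sketch using only HtmlAgilityPack follows; the id `download-link` is a placeholder assumption, and this approach will not work for buttons or links whose behavior is driven by scripts.

```csharp
using System;
using HtmlAgilityPack;

class FollowLinkWithoutBrowser
{
    static void Main()
    {
        var pageUri = new Uri("https://example.com/");
        var web = new HtmlWeb();

        // Download and parse the page that contains the link
        HtmlDocument page = web.Load(pageUri.AbsoluteUri);

        // "download-link" is a hypothetical id; replace it with the real one
        HtmlNode link = page.DocumentNode.SelectSingleNode("//a[@id='download-link']");
        if (link == null)
        {
            Console.WriteLine("Link not found.");
            return;
        }

        // Decode entities such as &amp; and resolve relative URLs against the page URL
        string href = HtmlEntity.DeEntitize(link.GetAttributeValue("href", ""));
        var target = new Uri(pageUri, href);

        // "Click" the link by requesting its target directly
        HtmlDocument targetPage = web.Load(target.AbsoluteUri);
        HtmlNode title = targetPage.DocumentNode.SelectSingleNode("//title");
        Console.WriteLine(title != null ? title.InnerText : "(no title)");
    }
}
```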
