C# web browser bot problem

Asked: 2011-03-11 02:59:53

Tags: c# .net browser bots

I want to build a web browser bot. It should click a link and then wait 25 seconds.

    private void webBrowserMain_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) // This is the only way it worked for me.
    {
        if (webBrowserMain.Url.AbsoluteUri == @"http://www.clix-cents.com/pages/clickads")
        {
            Regex regAddId = new Regex("onclick=\\'openad\\(\"([\\d\\w]+)\"\\);", RegexOptions.IgnoreCase); // Find link and click it.
            if (regAddId.IsMatch(webBrowserMain.DocumentText))
            {
                string AddId = regAddId.Match(webBrowserMain.DocumentText).Groups[1].ToString();
                webBrowserMain.Navigate(@"http://www.clix-cents.com/pages/clickads?h=" + AddId);
            }
        }
        else if (webBrowserMain.Url.AbsoluteUri.Contains("http://www.clix-cents.com/pages/clickads?h=")) // Up to here everything is OK, but the problem starts here.
        {
            Thread.Sleep(25000); // It pauses the whole thread and the browser, so the timer in the browser does not count down.
            Regex regCaptchaCode = new Regex("src=\\'/pages/captcha\\?t=c&s=([\\d\\w\\W]+)\\'", RegexOptions.IgnoreCase);
            if (regCaptchaCode.IsMatch(webBrowserMain.DocumentText))
            {
                // Groups[1] is the captured "s" value; Match.ToString() would return the whole matched text.
                pictureBox1.ImageLocation = @"http://www.clix-cents.com/pages/captcha?t=c&s=" + regCaptchaCode.Match(webBrowserMain.DocumentText).Groups[1].Value;
            }
        }
    }

How do I write a bot for something like this? I have no idea.

2 answers:

Answer 0: (score: 3)

Don't reinvent the wheel: there are existing solutions such as WatiN, which is mainly intended for testing but also works for automation.

Code sample from the WatiN page:

[Test]
public void SearchForWatiNOnGoogle()
{
  using (var browser = new IE("http://www.google.com"))
  {
    browser.TextField(Find.ByName("q")).TypeText("WatiN");
    browser.Button(Find.ByName("btnG")).Click();

    Assert.IsTrue(browser.ContainsText("WatiN"));
  }
}

Answer 1: (score: 1)

You could use a timer. For example:

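// Fully qualified so it cannot be confused with System.Threading.Timer.
private System.Windows.Forms.Timer t = new System.Windows.Forms.Timer();
private string nextUrl = "";
private void buttonStart_Click(object sender, EventArgs e)
{
    t.Interval = 25000; // 25 seconds, in milliseconds
    t.Tick -= t_Tick;   // avoid a duplicate subscription if the button is clicked again
    t.Tick += t_Tick;
}

void t_Tick(object sender, EventArgs e)
{
    t.Stop(); // one-shot: DocumentCompleted restarts the timer when needed
    if (!string.IsNullOrEmpty(nextUrl))
        webBrowserMain.Navigate(nextUrl);
    else
    {
        Regex regCaptchaCode = new Regex("src=\\'/pages/captcha\\?t=c&s=([\\d\\w\\W]+)\\'", RegexOptions.IgnoreCase);
        if (regCaptchaCode.IsMatch(webBrowserMain.DocumentText))
        {
            // Groups[1] is the captured "s" value; Match.ToString() would return the whole matched text.
            pictureBox1.ImageLocation = @"http://www.clix-cents.com/pages/captcha?t=c&s=" + regCaptchaCode.Match(webBrowserMain.DocumentText).Groups[1].Value;
        }
    }
}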
private void webBrowserMain_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (webBrowserMain.Url.AbsoluteUri == @"http://www.clix-cents.com/pages/clickads")
    {
        Regex regAddId = new Regex("onclick=\\'openad\\(\"([\\d\\w]+)\"\\);", RegexOptions.IgnoreCase); // Find the ad link.
        if (regAddId.IsMatch(webBrowserMain.DocumentText))
        {
            string AddId = regAddId.Match(webBrowserMain.DocumentText).Groups[1].ToString();
            nextUrl = @"http://www.clix-cents.com/pages/clickads?h=" + AddId;
            t.Start(); // the Tick handler navigates to nextUrl once the interval elapses
        }
    }
    else if (webBrowserMain.Url.AbsoluteUri.Contains("http://www.clix-cents.com/pages/clickads?h=")) // The ad page has loaded.
    {
        nextUrl = "";
        t.Start(); // wait without blocking the UI thread; the Tick handler then reads the captcha
    }
}
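
On C# 5 or later, the same non-blocking wait could also be sketched with `await Task.Delay` in place of a timer. This is an alternative under that assumption, and `NonBlockingWait` and `AfterDelay` are hypothetical names used only for illustration:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of WinForms or of the code above.
public static class NonBlockingWait
{
    // Waits for `delay` without blocking the calling thread, then runs `action`
    // and returns its result. When awaited from a WinForms event handler, the
    // continuation resumes on the UI thread, so the browser keeps running
    // (and its countdown keeps ticking) during the wait.
    public static async Task<T> AfterDelay<T>(TimeSpan delay, Func<T> action)
    {
        await Task.Delay(delay);
        return action();
    }
}
```

Inside a DocumentCompleted handler declared `async void`, this might be used as `string html = await NonBlockingWait.AfterDelay(TimeSpan.FromSeconds(25), () => webBrowserMain.DocumentText);` before running the captcha regex on `html`.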

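The actual implementation will depend on the actual data on the site and on how you want to use it. If all the links are on one page and you want to open each of them, you can parse all the links and store them in a list. Then start the timer. On each Tick you can open one item.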
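
The list-based approach might look roughly like the sketch below. It assumes the page marks every ad with the same `onclick='openad("ID");` pattern as in the question; `AdLinkQueue`, `Load` and `Next` are hypothetical names introduced only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: collects every ad URL from the page HTML
// and hands out one URL per call (for example, one per timer Tick).
public sealed class AdLinkQueue
{
    // Same pattern as in the question: onclick='openad("ID");
    private static readonly Regex AdIdPattern =
        new Regex("onclick='openad\\(\"(\\w+)\"\\);", RegexOptions.IgnoreCase);

    private const string BaseUrl = "http://www.clix-cents.com/pages/clickads?h=";

    private readonly Queue<string> pending = new Queue<string>();

    // Number of URLs still waiting to be opened.
    public int Count { get { return pending.Count; } }

    // Parses every ad ID found in the HTML and queues its URL.
    public void Load(string html)
    {
        foreach (Match m in AdIdPattern.Matches(html))
            pending.Enqueue(BaseUrl + m.Groups[1].Value);
    }

    // Returns the next URL to open, or null when the list is exhausted.
    public string Next()
    {
        return pending.Count > 0 ? pending.Dequeue() : null;
    }
}
```

In the Tick handler, one item would then be opened per tick: call `Next()`, stop the timer if it returns null, otherwise pass the URL to `webBrowserMain.Navigate`.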