How to find a form in ScrapySharp when the form has only attributes (no name or ID)

Date: 2019-11-30 03:47:59

Tags: c# web-scraping scrapysharp

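I'm new to ScrapySharp and to web scraping. I'm trying to scrape a secured website that has a login screen. The form element has no name or id attribute, which makes my life more complicated. I haven't been able to figure out how to load the form with the code below. Any insight would be greatly appreciated!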

C#:

using System;

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {   

            int num1 = int.Parse(args[0]);
            int num2 = int.Parse(args[1]);            
            bool GameOver = false;
            int turn = 3;
            Random random = new Random();
            int answer = random.Next(num1, num2);        
            // string input = "";

            Console.WriteLine("Hello, welcome to the guess a number challenge");

            while (!GameOver)
            {
                if (turn != 0)
                {                    
                    turn--;
                    Console.WriteLine($"Please Select number between {num1} to {num2}:");                    
                    int SelectedNumber = int.Parse(Console.ReadLine());
                    if (SelectedNumber < answer && SelectedNumber >= num1)
                    {
                        System.Console.WriteLine("Almost there, just the number is too small\n");
                    } else if (SelectedNumber > answer && SelectedNumber <= num2)
                    {
                        System.Console.WriteLine("Your number is too big\n");
                    } else if(SelectedNumber == answer)
                    {
                        System.Console.WriteLine("CONGRATULATIONS!!!! You guess it right\n");
                        GameOver = true;
                        retry();
                    } else
                    {
                        System.Console.WriteLine("Your number is out of range\n");
                    }
                } else
                {
                    System.Console.WriteLine($"GAME OVER!!!! The answer is {answer}");
                    GameOver = true;
                    retry();
                }

                void retry() {
                    System.Console.WriteLine("Would you like to retry? Y/N");
                    string input = Console.ReadLine();
                    string ConsoleInput = input.ToLower();
                    if(ConsoleInput == "y")
                    {
                        GameOver = false;
                        turn = 3;
                    } else if(ConsoleInput == "n")
                    {
                        GameOver = true;
                    } else
                    {
                        Console.WriteLine("Invalid input");
                        retry();
                    }
                }
            }
        }
    }
}

C# (ScrapySharp):

ScrapingBrowser browser = new ScrapingBrowser();
var homepage = browser.NavigateToPage(new Uri("https://somedomain.com/ProviderLogin.action/"));
var form1 = homepage.Find("form", ScrapySharp.Html.By.Text("form"));
var form2 = homepage.FindFormById("form[action='provider-login']");

2 Answers:

Answer 0 (score: 0)

You can't do this with "By" in ScrapySharp, because it supports only four element search kinds:

{
   Text,
   Id,
   Name,
   Class
}

Your form has none of these, so consider using "CssSelect" instead, which lets you match on the action attribute:

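// Requires: using ScrapySharp.Extensions;
// Exact match on the action attribute:
var form = homepage.Html.CssSelect("form[action='provider-login']");
// Or, substring match on the action attribute (use one or the other):
// var form = homepage.Html.CssSelect("form[action*='provider-login']");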
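
Note that CssSelect returns a sequence of HtmlNode objects, not a form object, so the result still has to be turned into something you can submit. The following is a minimal sketch of one way to do that, assuming the PageWebForm(HtmlNode, ScrapingBrowser) constructor and the namespaces shown here are available in your ScrapySharp version. The URL comes from the question, and the field names "username" and "password" are placeholders that should be replaced with the real input names from the login page.

```csharp
using System;
using System.Linq;
using ScrapySharp.Extensions;   // CssSelect extension method
using ScrapySharp.Html.Forms;   // PageWebForm (assumed namespace)
using ScrapySharp.Network;      // ScrapingBrowser, HttpVerb

class LoginSketch
{
    static void Main()
    {
        var browser = new ScrapingBrowser();
        var homepage = browser.NavigateToPage(
            new Uri("https://somedomain.com/ProviderLogin.action/"));

        // CssSelect yields zero or more nodes; take the first one,
        // or null when no form has a matching action attribute.
        var formNode = homepage.Html
            .CssSelect("form[action*='provider-login']")
            .FirstOrDefault();

        if (formNode == null)
        {
            Console.WriteLine("No form with a matching action attribute was found.");
            return;
        }

        // Wrap the raw node so that fields can be set and the form submitted.
        var form = new PageWebForm(formNode, browser);
        form["username"] = "some username"; // placeholder field name
        form["password"] = "some password"; // placeholder field name
        form.Method = HttpVerb.Post;

        var result = form.Submit();
        Console.WriteLine("Form submitted.");
    }
}
```

The substring selector (*=) is used here because the action attribute on a real page often carries a path prefix or a session suffix; switch to the exact selector (=) if the attribute value is known precisely.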

Answer 1 (score: 0)

You can find the first form node by tag and then pass it to the PageWebForm constructor:

var browser = new ScrapingBrowser();
var homepage = browser.NavigateToPage(new Uri("https://somedomain.com/ProviderLogin.action/"));

var form1node = homepage.Html.SelectSingleNode("//form");
var form1 = new PageWebForm(form1node, browser); // this is where it happens!

form1["username"] = "some username";
form1["password"] = "some password";
form1.Method = HttpVerb.Post;
var webpage = form1.Submit();