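I am just experimenting with Selenium WebDriver, and I have run into a few problems that I apparently need to resolve before moving on. When I run the code pasted at the bottom, Selenium takes forever to do anything with the browser, and the browser shows "Waiting for [insert site here]" for a while. After that, it eventually finds the elements and interacts with the page. Sometimes I get various errors, and they are not consistent (they do not break the code; it just takes much longer before Selenium does what I asked). Sometimes my antivirus pops up and says it blocked a site (global.ymtracking.com) because it is malware. I suspect this has something to do with the ads on the page and with how Selenium loads the page, but I really don't know, because I am new to this. I have never had any problem with this site when visiting it normally.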
using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace WebScrappingTesting
{
    class Program
    {
        static IWebDriver driver;
        // Team name -> { "Age", "Wins", "Losses" }
        static Dictionary<string, Dictionary<string, float>> hockeyStats = new Dictionary<string, Dictionary<string, float>>();
        static string TeamName = "";
        static float TeamAge;
        static float Wins;
        static float Losses;

        static void Main(string[] args)
        {
            driver = new ChromeDriver();
            driver.Navigate().GoToUrl("https://www.hockey-reference.com/leagues/NHL_2019.html");

            // Create (or overwrite) the CSV file and write the header row.
            using (StreamWriter file = new StreamWriter(@"C:\Users\hockeyData.csv", false))
            {
                file.WriteLine("Team Name,Avg Age,Wins,Losses");
            }

            for (var year = 2019; year > 2015; year--)
            {
                Console.WriteLine($"Getting data for {year}");
                var rowsOfData = driver.FindElements(By.CssSelector("#stats > tbody > tr"));

                // CSS :nth-child is 1-based, so the row index starts at 1.
                for (var i = 1; i <= rowsOfData.Count; i++)
                {
                    TeamName = driver.FindElement(By.CssSelector($"#stats > tbody > tr:nth-child({i}) > td.left > a")).Text;
                    float.TryParse(driver.FindElement(By.CssSelector($"#stats > tbody > tr:nth-child({i}) > td:nth-child(3)")).Text, out TeamAge);
                    float.TryParse(driver.FindElement(By.CssSelector($"#stats > tbody > tr:nth-child({i}) > td:nth-child(5)")).Text, out Wins);
                    float.TryParse(driver.FindElement(By.CssSelector($"#stats > tbody > tr:nth-child({i}) > td:nth-child(6)")).Text, out Losses);

                    // TODO: need to change this to a 3 dim dictionary or an object to be able to also record the year of the data
                    hockeyStats[TeamName] = new Dictionary<string, float>() {
                        { "Age", TeamAge },
                        { "Wins", Wins },
                        { "Losses", Losses }
                    };
                }

                foreach (var item in hockeyStats)
                {
                    Console.WriteLine("=============================================================");
                    Console.WriteLine($"Team Name: {item.Key}, Avg Age: {item.Value["Age"]}");
                    Console.WriteLine($"Wins {item.Value["Wins"]}, Losses {item.Value["Losses"]}");
                }

                // Append the collected rows to the CSV file.
                using (StreamWriter file = new StreamWriter(@"C:\Users\hockeyData.csv", true))
                {
                    foreach (var item in hockeyStats)
                    {
                        file.WriteLine($"{item.Key},{item.Value["Age"]},{item.Value["Wins"]},{item.Value["Losses"]}");
                    }
                }

                // Move on to the previous season's page.
                driver.FindElement(By.LinkText("Previous Season")).Click();
                //driver.FindElement(By.CssSelector($"[href*=/leagues/NHL_{year - 1}.html]"));
            }

            Console.WriteLine("=============================================================");
            Console.WriteLine("=============================================================");
            Console.WriteLine(@"Your file has been saved to C:\Users\hockeyData.csv");
            Console.WriteLine("=============================================================");
            Console.WriteLine("=============================================================");

            driver.Close();
            driver.Dispose();
        }
    }
}
Answer 0 (score: 0)
It appears to be related to the website itself. I ran Fiddler, and that site has a lot going on in the background.
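If the delay really does come from background requests (ads, trackers) that keep the page in a "loading" state, one thing that might be worth trying is a less strict page load strategy combined with an explicit wait for the table. The following is only a minimal sketch, not a verified fix for this site: it assumes a Selenium and ChromeDriver version that supports `PageLoadStrategy.Eager`, that `WebDriverWait` is available (in Selenium 3 it lives in the separate Selenium.Support package), and the 30-second timeout and the class name `EagerLoadSketch` are arbitrary choices made for illustration.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class EagerLoadSketch
{
    static void Main()
    {
        // Assumption: "Eager" returns control once the DOM is ready,
        // without waiting for images, ads, and other subresources.
        var options = new ChromeOptions
        {
            PageLoadStrategy = PageLoadStrategy.Eager
        };

        // The using block disposes the driver (and closes the browser) at the end.
        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl("https://www.hockey-reference.com/leagues/NHL_2019.html");

            // Explicitly wait (up to an assumed 30 seconds) until the stats
            // table has at least one row, instead of waiting for the full page load.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(30));
            wait.Until(d => d.FindElements(By.CssSelector("#stats > tbody > tr")).Count > 0);

            var rows = driver.FindElements(By.CssSelector("#stats > tbody > tr"));
            Console.WriteLine($"Found {rows.Count} rows");
        }
    }
}
```

The selector `#stats > tbody > tr` is taken from the question's code. Whether this shortens the wait depends on what is actually holding up the page, so treat it as an experiment rather than a guaranteed fix.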