Question

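I am trying to scrape course data from the site "https://schedule.msu.edu/". After I select a term and a subject and click the "Find Courses" button, a list of courses is displayed. Clicking a course (for example AAAS 100, with term Fall 2019 and subject African American & African Studies) opens a popup window. When I try to get data from the popup using Selenium, it throws an exception: "NoSuchElementException: Message: no such element: Unable to locate element:". The URL is different once the popup opens, but I can't figure out how to get the data from the popup. Any help with this issue would be appreciated.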

Below is the sample code using Selenium:

struct my_msgbuf buf;

if((pid = fork())<0)
{
    /* @TODO set the buf.mtype here  */
    perror("fork");
    for(;;) /* use loop inside parent process to write into MQ continuously */
    {
        /* @TODO scan the data into  buf.mtext */
        /* @TODO msgsnd statement */
    }
}
else
{
    for(;;) /* use loop inside child process to read from MQ continuously */
    {
        if (msgrcv (msqid, &buf, sizeof (buf), 1, 0) == -1)
            perror ("msgrcv");
        printf("received data : %s\n", buf.mtext);
    }
}

Answer 1

Those elements are all wrapped in a frame, so you should switch to that frame first. This code should work:

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Click a course to open the popup (click() returns None, so there is nothing to assign)
driver.find_element(By.XPATH, "//*[@id='MainContent_divHeader1_va']/h3[1]/a").click()

# The popup content lives in a frame: wait for it, then switch into it
WebDriverWait(driver, 15).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "CourseFrame")))

try:
    pop_up = WebDriverWait(driver, 15).until(EC.presence_of_element_located((By.XPATH, "//*[@id='RepeaterMain']/tbody/tr[1]/td/h3")))
    print(pop_up.text)
except TimeoutException:
    # WebDriverWait raises TimeoutException (not NoSuchElementException) when the wait expires
    pass

Output:

AAAS 390  Special Topics in Black/Africana Studies

Answer 2

There is a frame inside the popup modal. You have to switch to that frame before you can access the elements inside it.

Try this:

from selenium.webdriver.common.by import By

# Switch into the frame, then scrape from the popup
driver.switch_to.frame(driver.find_element(By.ID, "CourseFrame"))
pop_up = driver.find_element(By.XPATH, "//*[@id='RepeaterMain']")
print(pop_up.text)
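
One follow-up detail when switching into a frame: the driver stays inside that frame until you switch back, so lookups for elements on the main page (for example the next course link) would fail afterwards. Below is a minimal sketch of one way to handle that. The helper name `inside_frame` is hypothetical and not part of Selenium; it only relies on `driver.switch_to.frame(...)` and `driver.switch_to.default_content()`, which Selenium drivers provide.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block (hypothetical helper).

    frame_reference is whatever driver.switch_to.frame() accepts,
    e.g. a WebElement for the frame.
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Always return to the top-level page, even if scraping raised an error
        driver.switch_to.default_content()
```

With a real driver it might be used as `with inside_frame(driver, driver.find_element(By.ID, "CourseFrame")): print(driver.find_element(By.ID, "RepeaterMain").text)`, assuming the frame and table ids from the answers above. After the with-block the driver is back on the main page, so the next course can be clicked.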

Web scraping from a popup window using Python Selenium

2 answers: