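I am trying to scrape a dynamic table that contains the election results for the electoral circuits of an Argentine province. From this table, I want to retrieve the name of each electoral circuit (chosen through the 'cmbCircuitos' dropdown) and the number of votes cast for each political party [votos].
The problem is that even though the code works "correctly" (it runs without errors), there are some circuits, and therefore some election results, that it fails to retrieve. Specifically, the code retrieves circuit 1 twice because it never manages to extract circuit 2. Any idea why this happens, and how can I fix it?
The code is as follows: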
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

driver = webdriver.Chrome('/Users/Administrador/Documents/chromedriver')

cir = []     # circuit names
votos = []   # rows with three classed cells
votos1 = []  # rows with two classed cells


def switch_to_top():
    # The dropdowns and the "Mostrar" button live in the top frame.
    driver.switch_to.default_content()
    driver.switch_to.frame("topFrame")


def switch_to_main():
    # The results table is rendered in the main frame.
    driver.switch_to.default_content()
    driver.switch_to.frame("mainFrame")


main_url = 'https://www.justiciacordoba.gob.ar/Estatico/JEL/Escrutinios/ReportesEleccion20190512/default.html'
driver.get(main_url)

switch_to_top()
dropdown_secciones = driver.find_element_by_id('cmbSecciones')
select_box_secciones = Select(dropdown_secciones)
options_secciones = select_box_secciones.options
mostrar_click = driver.find_element_by_id('cmdMostrar')

# Index 0 is skipped in both dropdowns.
for index in range(1, len(options_secciones)):
    if (index > 1):
        switch_to_top()
    select_box_secciones.select_by_index(index)

    dropdown_circuitos = driver.find_element_by_id('cmbCircuitos')
    select_box_circuitos = Select(dropdown_circuitos)
    items_circuitos = select_box_circuitos.options

    for i in range(1, len(items_circuitos)):
        if (i > 1):
            switch_to_top()
        select_box_circuitos.select_by_index(i)
        mostrar_click.click()

        switch_to_main()
        WebDriverWait(driver, 220).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "body>table")))
        soup = BeautifulSoup(driver.page_source, "html.parser")

        # Circuit name shown in the results page.
        for td in soup.findAll('td', {'class': 'c1'}):
            circuitos = td.text
            cir.append(circuitos)

        # Vote rows: keep only cells that carry a class attribute, skip the header row.
        for tr in soup.find('table').find_all('tr'):
            row = tr.find_all(lambda td: td.has_attr('class'))
            if (len(row) == 3) and (row[0].text != 'Nº'):
                data = [td.text for td in row]
                votos.append(data)
            if (len(row) == 2) and (row[0].text != 'Nº'):
                datos = [td.text for td in row]
                votos1.append(datos)
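
One hypothesis I have not been able to confirm: after the first circuit, a `body>table` from the previous result is already visible in `mainFrame`, so the `WebDriverWait` returns immediately and `driver.page_source` may still hold the old circuit. If that is the cause, the scraper would need to wait until the frame content actually changes. Below is a minimal sketch of that idea with no Selenium dependency; `wait_for_change`, `read_value`, and the sample labels are hypothetical names used only for illustration, and the "frame" is simulated with an iterator.

```python
import time


def wait_for_change(read_value, previous, timeout=30.0, poll=0.5):
    """Poll read_value() until it returns something other than `previous`.

    Returns the new value, or raises TimeoutError if nothing changed in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = read_value()
        if current != previous:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("value did not change within %.1f s" % timeout)
        time.sleep(poll)


# Simulated frame that still shows the old circuit for the first two reads.
readings = iter(["Circuito 1", "Circuito 1", "Circuito 2"])
new_label = wait_for_change(lambda: next(readings), "Circuito 1", timeout=5, poll=0.01)
print(new_label)
```

In the real scraper, `read_value` would presumably switch to `mainFrame` and return the text of the `td.c1` cell, and `previous` would be the label captured before clicking `cmdMostrar`. This breaks down if two consecutive circuits show the same label; an alternative worth trying is to grab the old table element before the click and wait on Selenium's `EC.staleness_of` for it.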