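I'm trying to scrape all the data from a website called Correios. On that site I need to work with some dropdown menus, but I've run into a problem: the code returns a list that contains a long run of empty strings.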
from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()
dropdownEstados = driver.find_elements_by_xpath("""//*[@id="estadoAgencia"]""")
# Searches the whole page, not just the estadoAgencia dropdown
optEstados = driver.find_elements_by_tag_name("option")
for valores in optEstados:
    print(valores.text.encode())
This is what I get:
b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
How can I get rid of the empty b'' entries?
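Answer 0 (score: 0)
If I understand correctly, you want to find all of those options.
Try using the XPath //*[@id='estadoAgencia']/option to locate the dropdown's elements: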
from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()
dropdownEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']")
# find only the option elements inside the estadoAgencia dropdown
optEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']/option")
for valores in optEstados:
    print(valores.text.encode())
Output:
b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
With this XPath you get every option in the dropdown, and no empty strings apart from the single blank entry that belongs to this dropdown.
Note: the first element is an empty string because the dropdown itself contains one blank option.
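If that one remaining blank entry is unwanted too, it can be filtered out after the texts are collected. The following is a minimal sketch, not part of the original answer: the helper name non_blank and the sample list are made up for illustration, and the list stands in for [opt.text for opt in optEstados].

```python
def non_blank(texts):
    """Return only the texts that still contain something after stripping whitespace."""
    return [t for t in texts if t.strip()]


# Hypothetical stand-in for [opt.text for opt in optEstados]
sample = ["", "ACRE", "ALAGOAS", "AMAPÁ", "   ", ""]
print(non_blank(sample))
```

Printing the strings directly, rather than the result of .encode(), also avoids the b'...' prefix and the escaped accents seen in the output above.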
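Answer 1 (score: 0)
Your code needs only a small change: locate the dropdown with find_element_by_xpath (singular), then search for the options inside that element instead of across the whole page: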
dropdownEstados = driver.find_element_by_xpath("""//*[@id="estadoAgencia"]""")
optEstados = dropdownEstados.find_elements_by_tag_name("option")
for valores in optEstados:
    print(valores.text.encode())
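To see why scoping the search matters, here is a browser-free sketch that uses only Python's standard html.parser. It is an illustration under assumptions, not the real page: the PAGE markup and the second dropdown's id (municipioAgencia) are hypothetical, and OptionCollector and collect are names invented for this example. A page-wide search picks up options from every dropdown, while a search limited to estadoAgencia does not.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect <option> texts, page-wide or only inside one <select> id."""

    def __init__(self, select_id=None):
        super().__init__()
        self.select_id = select_id  # None means "every <select> on the page"
        self.in_scope = False
        self.in_option = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "select":
            wanted = self.select_id is None or dict(attrs).get("id") == self.select_id
            self.in_scope = wanted
        elif tag == "option" and self.in_scope:
            self.in_option = True
            self.texts.append("")  # one entry per option, even if it is blank

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_scope = False
        elif tag == "option":
            self.in_option = False

    def handle_data(self, data):
        if self.in_option:
            self.texts[-1] += data.strip()


def collect(html, select_id=None):
    parser = OptionCollector(select_id)
    parser.feed(html)
    parser.close()
    return parser.texts


# Hypothetical markup: the second dropdown has not been populated yet
PAGE = """
<select id="estadoAgencia"><option></option><option>ACRE</option><option>ALAGOAS</option></select>
<select id="municipioAgencia"><option></option></select>
"""

print(collect(PAGE))                   # page-wide, like driver.find_elements_by_tag_name
print(collect(PAGE, "estadoAgencia"))  # scoped, like dropdownEstados.find_elements_by_tag_name
```

The page-wide call returns one extra blank entry from the second dropdown; on a page with many unpopulated or hidden dropdowns, that would plausibly account for the long tail of empty strings in the question.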
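Answer 2 (score: 0)
To retrieve the text of every <option> in the dropdown whose id is estadoAgencia: because the element is a <select> tag, it is easier and more efficient to use the methods associated with <select> tags, which Selenium exposes through its Select class. You can use the following solution:
Code block: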
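from selenium.webdriver.support.ui import Select

# Wrap the <select> element so its options are available directly
estado_select = Select(driver.find_element_by_id('estadoAgencia'))
for opt in estado_select.options:
    print(opt.get_attribute('innerHTML'))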
Console output:
ACRE
ALAGOAS
AMAPÁ
AMAZONAS
BAHIA
CEARÁ
DISTRITO FEDERAL
ESPÍRITO SANTO
GOIÁS
MARANHÃO
MINAS GERAIS
MATO GROSSO DO SUL
MATO GROSSO
PARÁ
PARAÍBA
PERNAMBUCO
PIAUÍ
PARANÁ
RIO DE JANEIRO
RIO GRANDE DO NORTE
RONDÔNIA
RORAIMA
RIO GRANDE DO SUL
SANTA CATARINA
SERGIPE
SÃO PAULO
TOCANTINS