I want to retrieve the tr rows from the nested table #timeTable on this webpage.
I tried the following, but it gives an empty list.
import urllib.request
from bs4 import BeautifulSoup
nlg_timetable_url = "https://navlib.forth-crs.gr/italian_b2c/npgres.exe?func=TT&ReservationType=npgres.exe%3FPM%3DBO&Leg1i=PRJ&Leg1ii=BEV&Leg1Date=26%2F02%2F2019&TotalPassengers=1&TotalPassengersHuman=1&TotalPassengersAcce=0&TotalVehicles=0"
headers = {'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.3'}
request = urllib.request.Request(nlg_timetable_url,headers=headers)
html = urllib.request.urlopen(request).read()
soup = BeautifulSoup(html,'html.parser')
ngl_timetable_table = list(soup.select('#timeTable tr'))
print(ngl_timetable_table)
Output:
[]
Answer 0 (score: 2)
I would use the requests module:
import requests
from bs4 import BeautifulSoup
nlg_timetable_url = "https://navlib.forth-crs.gr/italian_b2c/npgres.exe?func=TT&ReservationType=npgres.exe%3FPM%3DBO&Leg1i=PRJ&Leg1ii=BEV&Leg1Date=26%2F02%2F2019&TotalPassengers=1&TotalPassengersHuman=1&TotalPassengersAcce=0&TotalVehicles=0"
headers = {'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.3'}
res = requests.get(nlg_timetable_url,headers=headers)
soup = BeautifulSoup(res.content,'html.parser')
for item in soup.select('#timeTable tr'):
    print(item.text)
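
If you want the cells of each row separately rather than one blob of text per row, something along these lines might help. This is only a sketch: `SAMPLE_HTML` and the `table_rows` helper are made up for illustration (the real page's markup may differ), and it runs against an inline string so it can be tried without hitting the site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (assumption).
SAMPLE_HTML = """
<table id="timeTable">
  <tr><th>Departure</th><th>Arrival</th></tr>
  <tr><td>08:00</td><td>09:30</td></tr>
  <tr><td>14:15</td><td>15:45</td></tr>
</table>
"""


def table_rows(html, selector="#timeTable tr"):
    """Return each matched row as a list of stripped cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select(selector):
        # recursive=False keeps only the cells that belong directly to this row.
        cells = [
            cell.get_text(strip=True)
            for cell in tr.find_all(["th", "td"], recursive=False)
        ]
        if cells:  # skip rows that have no th/td cells of their own
            rows.append(cells)
    return rows


rows = table_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

With the real page you would pass `res.content` instead of `SAMPLE_HTML`. If the selector matches nothing, the function simply returns an empty list.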