https://www.ptv.vic.gov.au/next5/diva/10018306/line/9777/2
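I am trying to get the departure times and destinations, but the page refreshes every 60 seconds and I cannot retrieve that information.
This is what I have tried so far: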
from bs4 import BeautifulSoup
import requests
from user_agent import generate_user_agent
from requests import get
headers = {'User-Agent': generate_user_agent(device_type="desktop", os=('mac', 'linux'))}
url = 'https://www.ptv.vic.gov.au/next5/diva/10004556/line/11613/2'
response = get(url, headers=headers)  # send the generated User-Agent with the request
html_soup = BeautifulSoup(response.text, 'html.parser')
type(html_soup)
datatest = html_soup.find_all('div', class_='timetable')
print(type(datatest))
print(len(datatest))
I would like to get at least the next 3 upcoming times and destinations from the website.
Answer 0 (score: 1)
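The live data is updated once a minute by a JSON request. Extracting the information from that JSON data is much easier than trying to scrape it from the rendered HTML. For example: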
from datetime import datetime
import requests
r = requests.get("https://www.ptv.vic.gov.au/langsing/stop-services?stopId=10018306&direction=Altona&limit=20&mode=2")
json_reply = r.json()
for value in json_reply['values']:
    dt_departing = datetime.strptime(value['time_timetable_utc'], '%Y-%m-%dT%H:%M:%SZ')
    departing = dt_departing.strftime("%I:%M%p") # 12hour format
    line_name = value['platform']['direction']['line']['line_name']
    print(f'{departing} - {line_name}')
This would give you output starting:
05:57PM - 903 - Altona - Mordialloc (SMARTBUS Service)
06:14PM - 903 - Altona - Mordialloc (SMARTBUS Service)
06:31PM - 903 - Altona - Mordialloc (SMARTBUS Service)
06:41PM - 903 - Altona - Mordialloc (SMARTBUS Service)
06:57PM - 903 - Altona - Mordialloc (SMARTBUS Service)
07:09PM - 903 - Altona - Mordialloc (SMARTBUS Service)
07:20PM - 903 - Altona - Mordialloc (SMARTBUS Service)
07:30PM - 903 - Altona - Mordialloc (SMARTBUS Service)
07:42PM - 903 - Altona - Mordialloc (SMARTBUS Service)
07:51PM - 903 - Altona - Mordialloc (SMARTBUS Service)
08:06PM - 903 - Altona - Mordialloc (SMARTBUS Service)
08:20PM - 903 - Altona - Mordialloc (SMARTBUS Service)
08:32PM - 903 - Altona - Mordialloc (SMARTBUS Service)
08:44PM - 903 - Altona - Mordialloc (SMARTBUS Service)
08:59PM - 903 - Altona - Mordialloc (SMARTBUS Service)
09:14PM - 903 - Altona - Mordialloc (SMARTBUS Service)
09:30PM - 903 - Altona - Mordialloc (SMARTBUS Service)
09:45PM - 903 - Altona - Mordialloc (SMARTBUS Service)
10:00PM - 903 - Altona - Mordialloc (SMARTBUS Service)
10:15PM - 903 - Altona - Mordialloc (SMARTBUS Service)
10:36PM - 706 - Mordialloc - Aspendale - Edithvale - Chelsea
01:32AM - 706 - Mordialloc - Aspendale - Edithvale - Chelsea
02:51AM - 706 - Mordialloc - Aspendale - Edithvale - Chelsea
10:36PM - 706 - Mordialloc - Aspendale - Edithvale - Chelsea
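The URL was found by watching the requests the browser makes every 60 seconds. You can easily adjust how the time is displayed by changing the format string, for example by using "%A %I:%M%p".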
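Since the question asks for only the next 3 departures, the loop above could be wrapped in a small helper that sorts the entries and keeps the first few. This is a minimal sketch, not part of the original answer: the names `next_departures` and `make_value` are hypothetical, and the sample data is made up to mirror the keys used above (`time_timetable_utc` and `platform` / `direction` / `line` / `line_name`), so it runs without a network request.

```python
from datetime import datetime, timezone


def next_departures(values, limit=3, fmt="%A %I:%M%p"):
    """Return up to `limit` strings of the form 'time - line name', earliest first."""
    rows = []
    for value in values:
        # The field name suggests the timestamp is UTC, so mark it as such.
        departing = datetime.strptime(
            value["time_timetable_utc"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        line_name = value["platform"]["direction"]["line"]["line_name"]
        rows.append((departing, line_name))
    rows.sort(key=lambda row: row[0])  # earliest departure first
    return [f"{dt.strftime(fmt)} - {name}" for dt, name in rows[:limit]]


def make_value(timestamp, line_name):
    """Build one made-up entry with the same nesting as the JSON reply (assumption)."""
    return {
        "time_timetable_utc": timestamp,
        "platform": {"direction": {"line": {"line_name": line_name}}},
    }


# Hypothetical sample data; in practice this would be json_reply['values'].
LINE = "903 - Altona - Mordialloc (SMARTBUS Service)"
sample_values = [
    make_value("2019-05-01T08:14:00Z", LINE),
    make_value("2019-05-01T07:57:00Z", LINE),
    make_value("2019-05-01T08:41:00Z", LINE),
    make_value("2019-05-01T08:31:00Z", LINE),
]

for line in next_departures(sample_values):
    print(line)
# Wednesday 07:57AM - 903 - Altona - Mordialloc (SMARTBUS Service)
# Wednesday 08:14AM - 903 - Altona - Mordialloc (SMARTBUS Service)
# Wednesday 08:31AM - 903 - Altona - Mordialloc (SMARTBUS Service)
```

With the live data you would call `next_departures(json_reply['values'])`. The times are printed exactly as parsed, so if the field really is UTC they would still need converting with `astimezone(...)` to show local time.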