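Web scraping returns blank results

Time: 2020-03-29 13:25:42

Tags: web-scraping beautifulsoup

Hi, I'm trying to scrape the data from this table, but the result comes back empty.

I'm new to this, so any help is appreciated.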

import requests
from bs4 import BeautifulSoup


cookies = {
    'sf-trckngckie': 'c9fd3a61-50f0-4bd0-8525-5b6a7a6e2196',
    '_ga': 'GA1.2.1731919508.1585340823',
    '_gid': 'GA1.2.1902580924.1585340823',
    '_fbp': 'fb.1.1585340823437.1647391105',
    '_xpid': '10731301',
    '_xpkey': '2JcY2pSuNJnZp7_luu59-8_Ty0vO4B38',
    '__qca': 'P0-1629792242-1585340823404',
    'CookieCheckboxes': '%5B%5D',
    'ApplicationGatewayAffinity': '12edde54e46ecf95244ef6ed73a5b6520b6e0f39c639062865023c047f171751',
    'ApplicationGatewayAffinityCORS': '12edde54e46ecf95244ef6ed73a5b6520b6e0f39c639062865023c047f171751',
    '_dc_gtm_UA-63215094-6': '1',
}

headers = {
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36',
    'Sec-Fetch-Dest': 'document',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-User': '?1',
    'Referer': 'https://www.dublinairport.com/flight-information/live-departures',
    'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
}

response = requests.get('https://www.dublinairport.com/flight-information/live-arrivals', headers=headers, cookies=cookies)

#print(response.content)
soup = BeautifulSoup(response.content, 'html.parser')

#Beautiful Soup command to organise the html data
#print(soup.prettify())

#Could run this to show the title of the site without html tags
#print(soup.title.string)

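When I run the next lines with find_all('a') instead, it works. But what I actually want is to scrape the table data as shown below, and that produces nothing.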


cells = soup.find_all('td')
for cell in cells:
    print(cell.string)
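
One way to narrow this down is to count which tags are actually present in the HTML that requests received. The sketch below is only a diagnostic, not a confirmed explanation: it assumes (without having checked the real site) that the table rows may be filled in by JavaScript after the page loads, in which case the static HTML would contain links but no td elements. SAMPLE_HTML, count_tags and cell_texts are hypothetical names made up for illustration; with the real page, response.content would be passed in instead of SAMPLE_HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.content: a page whose table body is
# empty in the static HTML (as it would be if a script filled it in later).
SAMPLE_HTML = """
<html><body>
  <a href="/flight-information/live-departures">Departures</a>
  <table id="flights"><tbody></tbody></table>
  <script src="/app.js"></script>
</body></html>
"""


def count_tags(html, names=("a", "table", "tr", "td", "script")):
    """Return how many times each tag name occurs in the parsed HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return {name: len(soup.find_all(name)) for name in names}


def cell_texts(html):
    """Return the text of every <td> in the HTML.

    get_text() is used instead of .string because .string is None
    whenever a cell contains more than one child node.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(" ", strip=True) for td in soup.find_all("td")]


counts = count_tags(SAMPLE_HTML)
print(counts)
```

If the count for td is 0 on the real response while a and script are non-zero, the rows are probably not in the downloaded HTML at all, and find_all('td') has nothing to return. If td is non-zero but the loop still prints None, the cells likely contain nested tags, and get_text() as in cell_texts would be the more robust choice.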

0 answers:

No answers