Web scraping in a Python for loop does not return the expected data

Time: 2019-01-30 14:17:26

Tags: python web-scraping beautifulsoup

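I'm having a problem scraping the F1 website with BeautifulSoup. I use a for loop to pick out the data I need from the site, but I only get one result back instead of every result in the class.

Here is my code: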

import requests
from bs4 import BeautifulSoup
from csv import writer

page = requests.get("https://www.formula1.com/")

soup = BeautifulSoup(page.content, 'html.parser')
data = soup.find_all("div", class_="race-list")

for container in data:
    countryname = container.find_all("span", class_="name")
    country = countryname[0].text
    racetype = container.find_all("span", class_="race-type")
    rtype = racetype[0].text
    racetime = container.find_all("time", class_="day")
    racetimename = racetime[0].text.replace("\n", "").strip()
    print(country)

My current output:

Australia

Expected output:

Australia

Bahrain

China

etc

Thanks!

1 answer:

Answer 0 (score: 3)

The culprit:

country = countryname[0].text

The reason:

There are 21 countries, and you are fetching only the first one because you take the zeroth index of the result list:

country = countryname[0].text
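To see the difference without hitting the live site, here is a minimal sketch that runs against a small inline HTML snippet. The markup is hypothetical: it only imitates the structure the question describes (one race-list container holding several name spans), and the real page may be laid out differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure described in the question:
# a single "race-list" container holding several races.
html = """
<div class="race-list">
  <span class="name">Australia</span>
  <span class="name">Bahrain</span>
  <span class="name">China</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
container = soup.find("div", class_="race-list")
names = container.find_all("span", class_="name")

# Indexing [0] returns only the first matching element.
first_only = names[0].text

# Iterating over the result list visits every matching element.
all_names = [span.text for span in names]

print(first_only)
print(all_names)
```

With this snippet, the first print shows Australia and the second shows ['Australia', 'Bahrain', 'China'], which mirrors the gap between the current and the expected output in the question.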

The answer:

Loop over countryname to go through all of the elements:

import requests
from bs4 import BeautifulSoup
from csv import writer

page = requests.get("https://www.formula1.com/")

soup = BeautifulSoup(page.content, 'html.parser')
data = soup.find_all("div", class_="race-list")

for container in data:
    # find_all returns every match inside the container, so iterate over
    # the results instead of indexing [0], which keeps only the first one.
    countryname = container.find_all("span", class_="name")
    racetype = container.find_all("span", class_="race-type")
    racetime = container.find_all("time", class_="day")
    # Assumption: the three lists line up, one entry of each per race.
    for name, rt, day in zip(countryname, racetype, racetime):
        country = name.text
        rtype = rt.text
        racetimename = day.text.replace("\n", "").strip()
        print(country)

Output:

Australia
Bahrain
China
Azerbaijan
Spain
Monaco
Canada
France
Austria
Great Britain
Germany
Hungary
Belgium
Italy
Singapore
Russia
Japan
Mexico
United States
Brazil
Abu Dhabi