Using BeautifulSoup, I find the votes value on the page with:
vote = container.find('span', attrs = {'name':'nv'})['data-value']
How can I find the Gross value, given that its span has the same name?
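Answer 0 (score: 1)
You could use findAll, which returns every matching span in document order, and take the second item to get the value of the Gross field. For example: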
elements = container.findAll('span', attrs = {'name':'nv'})
votes = elements[0]['data-value']
gross = elements[1]['data-value']
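Indexing elements[1] raises an IndexError when an item has a votes span but no gross span. The sketch below guards against that by checking the list length first. The HTML string is hypothetical markup, assumed to resemble the page in the question, and votes_and_gross is an illustrative helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to resemble the page in the question.
html = """
<div class="lister-item">
  <span name="nv" data-value="1000">1,000</span>
  <span name="nv" data-value="2,500,000">$2.50M</span>
</div>
<div class="lister-item">
  <span name="nv" data-value="42">42</span>
</div>
"""

def votes_and_gross(container):
    """Return (votes, gross); a value is None when its span is absent."""
    elements = container.find_all('span', attrs={'name': 'nv'})
    votes = elements[0]['data-value'] if len(elements) > 0 else None
    gross = elements[1]['data-value'] if len(elements) > 1 else None
    return votes, gross

soup = BeautifulSoup(html, 'html.parser')
results = [votes_and_gross(div)
           for div in soup.find_all('div', class_='lister-item')]
print(results)  # [('1000', '2,500,000'), ('42', None)]
```

The attrs dictionary is needed here because name is already the first positional parameter of find_all (the tag name), so it cannot be passed as a keyword filter.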
Answer 1 (score: 1)
This isn't a very Pythonic way of doing it, but I kind of like it.
from bs4 import BeautifulSoup
import requests

def get_imdb_data(url):
    # Fetch the page and parse it with Python's built-in HTML parser
    data = requests.get(url)
    soup = BeautifulSoup(data.text, 'html.parser')
    divs = soup.findAll('div', {'class':'lister-item'})
    movies = []
    for div in divs:
        movie = {}
        movie['name'] = div.find('h3').find('a').text
        # The votes and gross spans both have name="nv"; votes comes first
        spans = div.findAll('span', {'name':'nv'})
        try:
            movie['votes'] = spans[0]['data-value']
        except (IndexError, KeyError):
            pass  # this item has no votes span
        try:
            movie['gross'] = spans[1]['data-value']
        except (IndexError, KeyError):
            pass  # this item has no gross span
        movies.append(movie)
    return movies

url = 'https://www.imdb.com/search/title?release_date=2018&sort=num_votes,desc&page=1'
data = get_imdb_data(url)
print(data)
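Both answers pick the Gross value by position. A different technique, not taken from either answer, is to locate the text label that precedes the value and read the next sibling span. The sketch below assumes each value is preceded by a span whose text is exactly "Votes:" or "Gross:"; that markup, the sample titles, and the helper name value_after_label are all hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: each value span is assumed to follow a label span.
html = """
<div class="lister-item">
  <h3><a href="#">Example Movie A</a></h3>
  <p>
    <span class="text-muted">Votes:</span>
    <span name="nv" data-value="1000">1,000</span>
    <span class="ghost">|</span>
    <span class="text-muted">Gross:</span>
    <span name="nv" data-value="2,500,000">$2.50M</span>
  </p>
</div>
<div class="lister-item">
  <h3><a href="#">Example Movie B</a></h3>
  <p>
    <span class="text-muted">Votes:</span>
    <span name="nv" data-value="42">42</span>
  </p>
</div>
"""

def value_after_label(container, label_text):
    """Return data-value of the name="nv" span that follows the label, or None."""
    label = container.find('span', string=label_text)
    if label is None:
        return None
    value = label.find_next_sibling('span', attrs={'name': 'nv'})
    return value['data-value'] if value is not None else None

soup = BeautifulSoup(html, 'html.parser')
movies = []
for div in soup.find_all('div', class_='lister-item'):
    movies.append({
        'name': div.find('h3').find('a').text,
        'votes': value_after_label(div, 'Votes:'),
        'gross': value_after_label(div, 'Gross:'),
    })
print(movies)
```

Looking up by label does not depend on how many name="nv" spans an item has, but it does depend on the label text, so it only helps if the real page carries such labels.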