Question

I am trying to extract movie titles and links from this site.

from bs4 import BeautifulSoup
from requests import get


link = "https://tamilrockerrs.ch"
r = get(link).content
#r = open('json.html','rb').read()
b = BeautifulSoup(r,'html5lib')
a = b.findAll('p')[1]

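The problem is that the titles are not wrapped in any tag, so I can't extract them. If it is possible to extract them, how can I tie each link to its title?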

Thanks in advance.

Answer 1

You can find the title and link this way.

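from bs4 import BeautifulSoup
import requests

url = "http://tamilrockerrs.ch"

# Download the page and parse it with the built-in HTML parser
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Select every <div class="title">; each one is expected to wrap an <a>
films = soup.find_all('div', {"class": "title"})

for film in films:
    anchor = film.find('a')
    if anchor is None:  # skip title blocks that contain no link
        continue
    print("Title:", anchor.text)        # get the title here
    print("Link:", anchor.get("href"))  # get the link here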
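
If the titles really are bare text nodes sitting next to the links, as the question describes, one possible approach is to walk from each link to its neighbouring text node with previous_sibling. The sketch below is only an illustration: the markup in SAMPLE_HTML, the example.com URLs, and the helper name pair_titles_with_links are assumptions made up for the example, not the real structure of the site, so the sibling direction may need adjusting (for instance next_sibling if the text follows the link).

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup (an assumption, not the real page): each title is a
# bare text node placed directly before its <a> element.
SAMPLE_HTML = """
<p>
  Movie One (2019) <a href="https://example.com/one">Download</a><br/>
  Movie Two (2018) <a href="https://example.com/two">Download</a><br/>
</p>
"""


def pair_titles_with_links(html):
    """Return (title, href) pairs, taking the untagged text just before each <a> as the title."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for anchor in soup.find_all("a", href=True):
        node = anchor.previous_sibling
        # Step backwards over whitespace-only text nodes
        while isinstance(node, NavigableString) and not node.strip():
            node = node.previous_sibling
        # Keep the pair only if a non-empty text node precedes the link
        if isinstance(node, NavigableString):
            pairs.append((node.strip(), anchor["href"]))
    return pairs


for title, href in pair_titles_with_links(SAMPLE_HTML):
    print("Title:", title)
    print("Link:", href)
```

Returning tuples keeps each title bound to its link, so the pairs can be passed to dict() or written to a file without the two lists drifting out of step.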

BeautifulSoup python: get text without a tag and get the adjacent link
