BeautifulSoup Python: get text that has no tag and get the adjacent link

Date: 2019-03-06 10:10:32

Tags: python-3.x web-scraping beautifulsoup

I am trying to extract the movie titles and links from this site.

from bs4 import BeautifulSoup
from requests import get


link = "https://tamilrockerrs.ch"
r = get(link).content  # raw bytes of the page
#r = open('json.html','rb').read()
b = BeautifulSoup(r, 'html5lib')
a = b.find_all('p')[1]  # second <p> element on the page

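The problem is that the titles are not wrapped in any tag, so I cannot extract them. If it is possible to extract them, how do I tie each link to its title?

Thanks in advance.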

1 Answer:

Answer 0: (score: 1)

You can find the title and the link this way:

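from bs4 import BeautifulSoup
import requests

url = "http://tamilrockerrs.ch"

# Download the page and parse it with the built-in HTML parser
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Each film entry is expected in a <div class="title"> that wraps an <a>
for film in soup.find_all('div', {"class": "title"}):
    anchor = film.find('a')
    if anchor is None:
        continue  # skip title blocks that contain no link
    print("Title:", anchor.text)        # get the title here
    print("Link:", anchor.get("href"))  # get the link here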
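
If the titles really are bare text sitting next to the links, as the question describes, one possible approach is to start from each <a> and walk backwards over its siblings until a text node turns up. The sketch below is only an illustration: SAMPLE_HTML is made-up markup (the real page structure is not shown in the question), and pair_titles_with_links is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical markup, assumed for illustration only: each title is a bare
# text node that comes right before its link inside the same <p>.
SAMPLE_HTML = """
<p>
  Movie One (2019) <a href="http://example.com/one">Download</a><br>
  Movie Two (2018) <a href="http://example.com/two">Download</a><br>
</p>
"""


def pair_titles_with_links(html):
    """Return (title, href) pairs, taking each title from the text before a link."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for anchor in soup.find_all("a", href=True):
        title = None
        # previous_siblings yields the nodes before the link, nearest first
        for sibling in anchor.previous_siblings:
            if isinstance(sibling, Tag):
                break  # reached another tag (e.g. <br>), so no bare text for this link
            text = sibling.strip()
            if text:
                title = text
                break
        if title:
            pairs.append((title, anchor["href"]))
    return pairs


for title, href in pair_titles_with_links(SAMPLE_HTML):
    print("Title:", title)
    print("Link:", href)
```

Because the title and the href are collected in the same loop iteration, they stay bound together as a tuple. On the real page the text may come after the link rather than before it; in that case next_siblings would be the attribute to walk instead.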