I'm new to Python and I'm trying to scrape some article links from a website. I managed to scrape the information I need from the first page, but I don't know how to do the same for all the following pages. I know there are many posts about this problem, but I haven't been able to solve it. The code I'm using is this:
from bs4 import BeautifulSoup
import requests
import csv
csv_file = open('cms_scrape.csv', 'w', newline='')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['date', 'link'])
base_url = 'https://www.khaleejtimes.com'
search_url = 'https://www.khaleejtimes.com/search&text=&content=articles&datefilter=24hours&sort=oldest&facet.filter=TaxonomyLeaf:Coronavirus%20outbreak'
def get_info(article):
    date = article.find('div', class_='author_date').text
    print(date)
    link = base_url + article.find('a')['href']
    print(link)
    return date, link

source = requests.get(search_url).text
soup = BeautifulSoup(source, 'lxml')
results = soup.find(class_='search_listing')

for article in results.find_all('li'):
    date, link = get_info(article)
    print()
    csv_writer.writerow([date, link])
csv_file.close()
Any help would be greatly appreciated.
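For context, here is the direction I was thinking of trying. This is only a sketch, and it assumes the site paginates via a `page` query parameter appended to the search URL, which I have not confirmed (the real parameter name would need to be checked in the browser's address bar when clicking "next page"):

```python
# Sketch of paginating the search, assuming a hypothetical `page`
# query parameter; the real parameter must be verified on the site.
search_url = ('https://www.khaleejtimes.com/search&text=&content=articles'
              '&datefilter=24hours&sort=oldest'
              '&facet.filter=TaxonomyLeaf:Coronavirus%20outbreak')

def page_url(page):
    """Return the search URL for the given 1-based page number."""
    if page == 1:
        return search_url  # first page needs no page parameter
    return '{}&page={}'.format(search_url, page)

# The scraping loop would then fetch each page until no results remain,
# reusing the existing per-article logic:
#
# for page in range(1, MAX_PAGES + 1):
#     source = requests.get(page_url(page)).text
#     soup = BeautifulSoup(source, 'lxml')
#     results = soup.find(class_='search_listing')
#     if results is None or not results.find_all('li'):
#         break  # ran out of pages
#     for article in results.find_all('li'):
#         date, link = get_info(article)
#         csv_writer.writerow([date, link])
```

Is something along these lines the right approach, or does the site need a different pagination mechanism?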