I successfully scraped the first page of the website, but when I tried to scrape multiple pages, the code ran without errors and the result was completely wrong.
Code:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin
for num in range(1,15):
    res = requests.get('http://www.abcde.com/Part?Page={num}&s=9&type=%8172653').text
    soup = BeautifulSoup(res,"lxml")
    for item in soup.select(".article-title"):
        print(urljoin('http://www.abcde.com',item['href']))
Only one number changes in each page's URL, for example:
http://www.abcde.com/Part?Page=1&s=9&type=%8172653
http://www.abcde.com/Part?Page=2&s=9&type=%8172653
There are 14 pages in total.
My code ran, but it just printed the first page's URLs 14 times. I expected the loop to print the different URLs from each of the different pages.
Answer 0 (score: 2)
As Jon Clements pointed out, the page number is never substituted into the URL string. Format the URL as below:
res = requests.get('http://www.abcde.com/Part?Page={}&s=9&type=%8172653'.format(num)).text
You can find more about Python format strings at pyformat.info.
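Here is a minimal sketch of why the original loop repeated itself, assuming only the URL pattern from the question. Without `.format()` or an f-string prefix, `{num}` stays in the string as literal text, so every iteration requests the same URL. The names `URL_TEMPLATE`, `page_url`, `literal`, and `urls` are hypothetical, introduced here for illustration, and no network request is made.

```python
# Hypothetical names for illustration; only the URL pattern comes from the question.
URL_TEMPLATE = 'http://www.abcde.com/Part?Page={}&s=9&type=%8172653'


def page_url(num):
    """Return the listing URL with the page number substituted in."""
    return URL_TEMPLATE.format(num)


# A plain string literal is not formatted: the braces and "num" are kept verbatim,
# so this value is identical on every pass through the loop.
literal = 'http://www.abcde.com/Part?Page={num}&s=9&type=%8172653'

# Pages 1 through 14, matching range(1, 15) in the question.
urls = [page_url(num) for num in range(1, 15)]

# On Python 3.6+, an f-string gives the same result as .format().
num = 2
f_string_url = f'http://www.abcde.com/Part?Page={num}&s=9&type=%8172653'

if __name__ == '__main__':
    print(literal)
    for url in urls:
        print(url)
```

Each entry in `urls` could then be passed to `requests.get` inside the loop in place of the unformatted string.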