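Looping through URLs with a 'for' loop: string vs. integer problem

Time: 2018-06-13 00:09:00

Tags: python python-3.x url web-scraping beautifulsoup

I am trying to figure out how to use a 'for' loop (or any other loop) to iterate over URLs.

I am scraping data from Howlongtobeat.com. The URLs are structured like this: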

  

https://howlongtobeat.com/game.php?id=38050

Only the number after "id=" changes. How can I make the number at the end of the string change?

page_number = range (38040, 38060)

url = 'https://howlongtobeat.com/game.php?id={page_number}'

This does not work, because the number is never actually added to the string.

url = 'https://howlongtobeat.com/game.php?id=' + page_number 

This does not work either, because I get this error:

 TypeError: must be str, not range
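Both failures can be reproduced in a few lines, which may make the cause easier to see. This is only a minimal sketch: the names `plain`, `concat_error`, `first_id` and `formatted` are made up for illustration, the f-string assumes Python 3.6 or newer, and the exact wording of the TypeError message differs between Python versions.

```python
page_number = range(38040, 38060)

# A plain string literal keeps the braces verbatim; nothing is substituted.
plain = 'https://howlongtobeat.com/game.php?id={page_number}'

# str + range raises TypeError, because a range object is not a string.
try:
    'https://howlongtobeat.com/game.php?id=' + page_number
    concat_error = None
except TypeError as exc:
    concat_error = exc

# An f-string (note the leading f) substitutes a single integer from the range.
first_id = page_number[0]
formatted = f'https://howlongtobeat.com/game.php?id={first_id}'

print(plain)
print(type(concat_error).__name__)
print(formatted)
```

In short, the braces only mean something inside an f-string or a `str.format()` call, and the value placed in them should be one number at a time rather than the whole range.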

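FYI, I am using BeautifulSoup and the csv writer to scrape the data and write it to a CSV file.

I am a beginner at this, so please start from the basics.

Thanks!

1 Answer:

Answer 0: (score: 0)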

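from bs4 import BeautifulSoup  # not needed to build the URLs; only for parsing pages later

# Base URL without the id number
url = 'https://howlongtobeat.com/game.php?id='

# range() yields integers, so convert each one with str() before concatenating
for page in range(38040, 38060):
    new_url = url + str(page)
    print(new_url)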

Output:

C:\Users\siva\Desktop>python test.py
https://howlongtobeat.com/game.php?id=38040
https://howlongtobeat.com/game.php?id=38041
https://howlongtobeat.com/game.php?id=38042
https://howlongtobeat.com/game.php?id=38043
https://howlongtobeat.com/game.php?id=38044
https://howlongtobeat.com/game.php?id=38045
https://howlongtobeat.com/game.php?id=38046
https://howlongtobeat.com/game.php?id=38047
https://howlongtobeat.com/game.php?id=38048
https://howlongtobeat.com/game.php?id=38049
https://howlongtobeat.com/game.php?id=38050
https://howlongtobeat.com/game.php?id=38051
https://howlongtobeat.com/game.php?id=38052
https://howlongtobeat.com/game.php?id=38053
https://howlongtobeat.com/game.php?id=38054
https://howlongtobeat.com/game.php?id=38055
https://howlongtobeat.com/game.php?id=38056
https://howlongtobeat.com/game.php?id=38057
https://howlongtobeat.com/game.php?id=38058
https://howlongtobeat.com/game.php?id=38059
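
Since the question mentions writing the results with the csv writer, the same loop could feed a CSV file roughly as follows. This is a sketch under assumptions, not the asker's actual code: `write_game_rows`, `fake_scrape` and the column names are hypothetical, and `fake_scrape` only stands in for the real download and BeautifulSoup parsing step, which depends on the page layout and is not shown here.

```python
import csv
import io

BASE_URL = 'https://howlongtobeat.com/game.php?id='


def write_game_rows(out_file, start, stop, scrape_page):
    """Write one CSV row per id in range(start, stop).

    scrape_page is a hypothetical callable: it takes a URL and returns
    whatever text was extracted from that page.
    """
    writer = csv.writer(out_file)
    writer.writerow(['id', 'url', 'data'])  # header row
    for game_id in range(start, stop):
        url = BASE_URL + str(game_id)  # str() turns the int into text
        writer.writerow([game_id, url, scrape_page(url)])


def fake_scrape(url):
    # Stand-in for the real download and BeautifulSoup parsing step.
    return 'placeholder for ' + url.rsplit('=', 1)[1]


buffer = io.StringIO()  # in-memory file, so the sketch writes nothing to disk
write_game_rows(buffer, 38040, 38043, fake_scrape)
print(buffer.getvalue())
```

To write a real file instead of the in-memory buffer, the csv module documentation recommends opening it with `newline=''`, for example `open('games.csv', 'w', newline='')`, where the file name is again just an example.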