So far in my web scraping work, the pagination I have come across uses an integer that increments with each page. For example:
Los%20Angeles%2C%20CA&page=1
Los%20Angeles%2C%20CA&page=2
Los%20Angeles%2C%20CA&page=3
I handled those like this:
for i in range(1, 4):
    # Pages 1 to 3; range() excludes the stop value.
    url = "Los%20Angeles%2C%20CA&page={0}".format(i)
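Today, however, I came across pagination that is ordered alphabetically, and I can't work out how to step it from A to B and so on. For example: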
browse-business-directory/char:A
browse-business-directory/char:B
browse-business-directory/char:C
I am writing the script in Python.
Answer 0 (score: 1)
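You can use Python's built-in ord and chr functions. ord gives a character's code point ('a' is 97, 'A' is 65) and chr turns a code point back into its character. Your URLs use uppercase letters, so loop from 'A' to 'Z':
for i in range(ord('A'), ord('Z') + 1):
    # chr(65) is 'A', chr(90) is 'Z'
    url = "browse-business-directory/char:{0}".format(chr(i))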
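An alternative that avoids the code point arithmetic is to iterate over string.ascii_uppercase from the standard library. This is only a sketch: PATH_TEMPLATE and letter_urls are names made up for illustration, and the path is the relative one from the question, since the site's base URL is not shown there.

```python
import string

# Relative path copied from the question; the base URL is not given there,
# so prepend whatever host the site actually uses.
PATH_TEMPLATE = "browse-business-directory/char:{0}"


def letter_urls(letters=string.ascii_uppercase):
    """Return one URL path per letter, in the order given (hypothetical helper)."""
    return [PATH_TEMPLATE.format(letter) for letter in letters]


urls = letter_urls()
for url in urls:
    print(url)
```

Passing a different string, for example letter_urls("ABC"), limits the loop to just those pages, which is handy while testing the scraper.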