Looping while scraping data

Time: 2019-03-02 01:17:57

Tags: python web-scraping

I am trying to scrape data using a loop. This is my code:

import requests
import json
import pandas as pd

parameters = ['a:1','a:2','a:3','a:4','a:3','a:4','a:5','a:6','a:7','a:8','a:9','a:10']

results = pd.DataFrame()
for item in parameters:
    key, value = item.split(':')
    url = "https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php?keyword=%s&type=2&limit=%s" %(key, value)
    r = requests.get(url)
    cont = json.loads(r.content)
    temp_df = pd.DataFrame(cont)
    results = results.append(temp_df)

results.to_csv('ScrapeData.csv', index=False)

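This approach works, but I need the parameters to go all the way up to 'a:1000'. I assume there is a better way to loop from 'a:1' to 'a:1000' than writing out every parameter by hand as I do in my code.

I would really appreciate your help.

2 Answers:

Answer 0: (score: 0)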

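value = 1
key = 'a'
while value <= 1000:
    # Same URL as in the question; the counter fills the limit parameter
    url = "https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php?keyword=%s&type=2&limit=%s" % (key, value)
    # ... request, parse, and append to results as in the question ...
    value += 1

# ... write results to CSV as in the question ...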

Use a counter.
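If the original for loop should stay exactly as it is, the parameters list itself could be generated instead of typed out. A minimal sketch, assuming the keyword is always 'a' and the limit runs from 1 to 1000:

```python
key = 'a'  # assumption: the same keyword is used for every request

# Build 'a:1' ... 'a:1000' instead of writing each entry by hand
parameters = [f'{key}:{value}' for value in range(1, 1001)]

print(parameters[:3])   # ['a:1', 'a:2', 'a:3']
print(len(parameters))  # 1000
```

The loop from the question can then iterate over this list unchanged, splitting each item on ':' as before.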

Answer 1: (score: 0)

You can use a for i in range(start, end) loop, like this:

import requests
import pandas as pd

key = 'a'
frames = []

# range(1, 1001) goes from 1 to 1000 (including both)
for value in range(1, 1001):
    url = f'https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php?keyword={key}&type=2&limit={value}'
    r = requests.get(url)
    # r.json() parses the JSON response body
    frames.append(pd.DataFrame(r.json()))

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
results = pd.concat(frames, ignore_index=True)
results.to_csv('ScrapeData.csv', index=False)
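
The URL for each iteration could also be assembled from a dictionary of query parameters rather than by string formatting, which takes care of escaping if the keyword ever contains special characters. A rough sketch using only the standard library; build_url is a hypothetical helper name, and the endpoint is the placeholder from the question:

```python
from urllib.parse import urlencode

# Placeholder endpoint copied from the question
BASE_URL = "https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php"

def build_url(key, limit):
    """Return the request URL for one keyword/limit pair (hypothetical helper)."""
    query = urlencode({"keyword": key, "type": 2, "limit": limit})
    return f"{BASE_URL}?{query}"

# One URL per limit value from 1 to 1000 (including both)
urls = [build_url("a", value) for value in range(1, 1001)]

print(urls[0])
print(len(urls))  # 1000
```

requests.get also accepts the same dictionary directly through its params argument, so the helper is optional.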