Web scraping multiple pages of a website

Time: 2020-05-22 15:52:25

Tags: python-3.x web-scraping

import csv

import requests
from bs4 import BeautifulSoup

for i in range(int(pages)): # pages = 6 (for example)
    print("iteration",i)
    url = "https://xxxxxx.io/requests/?module=xxxxx&dummy=&page="+str(i+1)  
    response_new = requests.get(url)
    soup_new = BeautifulSoup(response_new.text,"html.parser")
    table = soup_new.findAll('table')[1]
    tr = table.findAll(['tr'])[1:] 
    print(len(tr))
    csvFile = open(r"C:\Users\.ipynb_checkpoints\Splits_22052020",'wt',newline='',encoding='utf-8')
    writer = csv.writer(csvFile)  
    for cell in tr:
        th = cell.find_all('th')
        th_data = [col.text.strip('\n') for col in th]
        td = cell.find_all('td')
        row = [col.text.replace('\n','') for col in td]
        writer.writerow(th_data+row)      

csvFile.close()

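This is my code. I loop over the number of pages (e.g. 6). The code runs successfully, but only the first page's data table is saved in the .csv file. What is wrong here? Can anyone help me? Thanks.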
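
One likely cause, judging only from the code above: the file is opened with mode 'wt' inside the loop, so it is truncated on every iteration and the pages overwrite one another. Below is a minimal sketch of one possible fix, which opens the file once before the loop. The helper names `fetch_rows` and `save_pages` and the `fetch` parameter are made up for illustration. The URL and the table index are taken from the question.

```python
import csv


def fetch_rows(page):
    """Return the data rows of the second table on one page, as in the question."""
    # Imported here so the CSV-writing part below can be tried without these packages.
    import requests
    from bs4 import BeautifulSoup

    # URL copied from the question (host and module name are placeholders there).
    url = "https://xxxxxx.io/requests/?module=xxxxx&dummy=&page=" + str(page)
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    table = soup.find_all("table")[1]
    rows = []
    for tr in table.find_all("tr")[1:]:  # skip the header row
        th_data = [col.text.strip("\n") for col in tr.find_all("th")]
        td_data = [col.text.replace("\n", "") for col in tr.find_all("td")]
        rows.append(th_data + td_data)
    return rows


def save_pages(pages, out_path, fetch=fetch_rows):
    """Write the rows of pages 1..pages into a single CSV file."""
    # Open the file once, before the loop. Mode "w" truncates the file,
    # so reopening it for every page would throw away earlier output.
    with open(out_path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        for page in range(1, pages + 1):
            writer.writerows(fetch(page))
```

Calling `save_pages(6, "splits.csv")` (the file name is just an example) should then write the rows of all six pages to one file. If the file still only shows the first page's table, it may be worth checking whether the site really returns a different table for each `page` value.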

0 answers:

No answers