How to apply a loop over more than one page and add each page to a CSV file

Time: 2019-02-07 09:55:05

Tags: python web-scraping

How do I get data from more than one page into my CSV file?

from bs4 import BeautifulSoup
import requests
import csv
# NOTE: requests.get() fetches one URL per call; the second argument here
# is not treated as a second URL, so only page 1 is actually retrieved
source = requests.get('https://software-overzicht.nl/amersfoort?page=1','https://software-overzicht.nl/amersfoort?page=2' ).text
soup = BeautifulSoup(source, 'lxml')
csv_file = open('cms_scrape.csv','w')
csv_writter = csv.writer(csv_file)
csv_writter.writerow(['naambedrijf', 'adress'])
for search in soup.find_all('div', class_='company-info-top'):
    title = search.a.text
    adress = search.p.text
    # NOTE: this inner loop only reassigns url 21 times and never fetches
    # anything, so the extra pages are never requested
    for page in range(1, 22):
        url = 'https://software-overzicht.nl/amersfoort?page={}'.format(page)
    print(title)
    csv_writter.writerow([title,adress])
csv_file.close()

1 answer:

Answer 0 (score: 0)

You just need to move the requests.get() call, and the parsing that follows it, inside the loop over the page range:

from bs4 import BeautifulSoup
import requests
import csv

# The with-statement closes the file automatically when the block ends
with open('C:/cms_scrape.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(['naambedrijf', 'adress'])

    # Fetch and parse every page inside the loop
    for page in range(1, 22):
        url = 'https://software-overzicht.nl/amersfoort?page={}'.format(page)
        source = requests.get(url).text
        soup = BeautifulSoup(source, 'lxml')

        # Write one row per company listing found on this page
        for search in soup.find_all('div', class_='company-info-top'):
            title = search.a.text.strip()
            adress = search.p.text.strip()

            print(title)
            csv_writer.writerow([title, adress])
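
As a follow-up, here is a minimal sketch of a variant that avoids hardcoding the 21-page limit: it keeps requesting pages until one comes back without any company-info-top blocks, and reuses a single connection via requests.Session. The stopping condition assumes that pages past the last one render without any such blocks, which is untested here; the URL and CSS class are taken from the answer above.

from bs4 import BeautifulSoup
import requests
import csv
from itertools import count

with open('cms_scrape.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(['naambedrijf', 'adress'])

    # Reuse one TCP connection for all page requests
    session = requests.Session()

    # count(1) yields 1, 2, 3, ... until we break out
    for page in count(1):
        url = 'https://software-overzicht.nl/amersfoort?page={}'.format(page)
        soup = BeautifulSoup(session.get(url).text, 'lxml')
        results = soup.find_all('div', class_='company-info-top')

        # Stop as soon as a page has no listings (assumption: pages past
        # the end contain no company-info-top blocks)
        if not results:
            break

        for search in results:
            csv_writer.writerow([search.a.text.strip(), search.p.text.strip()])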