For loop iteration doesn't behave as expected

Posted: 2018-06-20 09:52:08

Tags: python-3.x

I have the following code, which scrapes a website and writes the results to a CSV file. The problem is that, for some reason, the for loop prints multiple copies of each iteration, when each one should only be written once. Can someone help point out what I'm missing here? Thanks.

import requests
from bs4 import BeautifulSoup
import csv

url = 'https://online.computicket.com'
home_page = requests.get(url)

home_page.content

soup = BeautifulSoup(home_page.content, 'lxml')


links = soup.find_all('a', {'class':'info'})

next_link = []

for link in links:
    next_link.append(link.get("href"))


for i in range(0, len(next_link),1):    
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])   

    for link in (url + next_link[i]):
        new_url.content
        soup = BeautifulSoup(new_url.content, 'lxml')

        info_name = soup.find('div', {'class' : 'es-cost'}) 
        heading = soup.find('h1',{'class' : 'full'})

        with open('Don.csv', 'a') as csv_file:

            #csv_file.write(heading.get_text())
            for name in soup.find_all('div', {'class' : 'es-cost'}):
                csv_file.write(heading.get_text())
                csv_file.write(name.get_text())

                print(name.get_text())

1 Answer:

Answer 0 (score: 0)

The multiple copies come from the nested for loop: `url + next_link[i]` is a string, and `for link in (url + next_link[i]):` iterates over that string one character at a time, so the loop body (including the CSV writes) runs once per character of the URL. Note that the `link` variable is never even used inside the loop. Remove the nested for statement, replacing this part of the code:

for i in range(0, len(next_link),1):    
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])   

    for link in (url + next_link[i]):
        new_url.content
        soup = BeautifulSoup(new_url.content, 'lxml')

        info_name = soup.find('div', {'class' : 'es-cost'}) 
        heading = soup.find('h1',{'class' : 'full'})

        with open('Don.csv', 'a') as csv_file:

            #csv_file.write(heading.get_text())
            for name in soup.find_all('div', {'class' : 'es-cost'}):
                csv_file.write(heading.get_text())
                csv_file.write(name.get_text())

                print(name.get_text())

with this:

for i in range(0, len(next_link),1):    
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])   

    new_url.content
    soup = BeautifulSoup(new_url.content, 'lxml')

    info_name = soup.find('div', {'class' : 'es-cost'}) 
    heading = soup.find('h1',{'class' : 'full'})

    with open('Don.csv', 'a') as csv_file:

        #csv_file.write(heading.get_text())
        for name in soup.find_all('div', {'class' : 'es-cost'}):
            csv_file.write(heading.get_text())
            csv_file.write(name.get_text())

            print(name.get_text())
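As a side note, the index-based loop can be simplified further: `next_link.append(i)` only pushes loop indices back onto the list and can be dropped, and iterating over the links directly avoids the index bookkeeping entirely. Here is a minimal offline sketch of that pattern (using a hard-coded HTML snippet in place of the live request, and assuming bs4 is installed):

```python
from bs4 import BeautifulSoup

# Stand-in for the page fetched from the site; the real code
# would use requests.get(url).content here instead.
html = '''
<a class="info" href="/event/1">One</a>
<a class="info" href="/event/2">Two</a>
'''
url = 'https://online.computicket.com'

soup = BeautifulSoup(html, 'html.parser')

# Iterate over the <a class="info"> tags directly -- no next_link
# list or range(len(...)) indexing needed.
for link in soup.find_all('a', {'class': 'info'}):
    full_url = url + link.get('href')
    print(full_url)
```

Each `link` here is a tag object, so `link.get('href')` reads its attribute directly; the string concatenation happens only once per link, not once per character.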