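I am using Python and Beautiful Soup to extract data from a web page, and the extraction works. The problem is that not all of the values end up in the CSV file. For example, if I extract 10 values, only the 10th one is written to the CSV file and the first 9 are not. All 10 values are printed in the terminal, but they do not appear in the CSV file.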
import csv
import urllib.request
from bs4 import BeautifulSoup
# specify the url
quote_page = "https://www.cardekho.com/Hyundai/Gurgaon/cardealers"
#quote_page = input("Enter Data Source Here : ")
page = urllib.request.urlopen(quote_page)
# parse the html using beautiful soup and store in variable `soup`
soup = BeautifulSoup(page, "lxml")
# Take out the <div> of name and get its value
delrname = soup.find_all('div', class_='deleadres')
for name in delrname:
    dname = name.find('div', class_="delrname").text # name
    print(dname)
for address in delrname:
    dadres = address.find('p').text
    print(dadres)
for mobile in delrname:
    dmobile = mobile.find('div', class_="clearfix").text
    print(dmobile)
for email in delrname:
    demail = email.find('div', class_="mobno").text
    print(demail)
# exporting data into csv file....
with open('result.csv',newline='') as f:
    r = csv.reader(f)
    data = [line for line in r]
with open('result.csv','w',newline='') as f:
    w = csv.writer(f)
    w.writerow(['NAME','ADDRES','MOBILE','EMAIL'])
    w.writerow([dname,dadres,dmobile,demail])
Answer 0 (score: 1)
When you assign a value inside a for loop, it replaces the previous value. So once the loop has finished, the variable holds only the final value.
for number in 1, 2, 3:
    print(number)  # prints 1, then 2, then 3
print(number)  # prints only 3, since that was the final value.
In your script, use a single for loop to extract the values and write each row of data to the CSV.
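A minimal sketch of a single loop that extracts the values and writes each row, assuming the page structure from the question (`deleadres` blocks that contain `delrname`, `p`, `clearfix` and `mobno` elements). The inline `SAMPLE_HTML` and the helper name `dealers_to_csv` are made up for illustration, and `html.parser` is used only so that the sketch runs without lxml.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
SAMPLE_HTML = """
<div class="deleadres">
  <div class="delrname">Dealer A</div>
  <p>Address A</p>
  <div class="clearfix">111</div>
  <div class="mobno">a@example.com</div>
</div>
<div class="deleadres">
  <div class="delrname">Dealer B</div>
  <p>Address B</p>
  <div class="clearfix">222</div>
  <div class="mobno">b@example.com</div>
</div>
"""


def dealers_to_csv(html, out):
    """Write one CSV row per dealer block found in html to the file object out."""
    soup = BeautifulSoup(html, "html.parser")
    w = csv.writer(out)
    w.writerow(['NAME', 'ADDRES', 'MOBILE', 'EMAIL'])
    # One loop: extract all four fields of a dealer, then write that row
    # immediately, before the next iteration overwrites the variables.
    for dealer in soup.find_all('div', class_='deleadres'):
        dname = dealer.find('div', class_='delrname').text.strip()
        dadres = dealer.find('p').text.strip()
        dmobile = dealer.find('div', class_='clearfix').text.strip()
        demail = dealer.find('div', class_='mobno').text.strip()
        w.writerow([dname, dadres, dmobile, demail])


buffer = io.StringIO()
dealers_to_csv(SAMPLE_HTML, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of an in-memory buffer, pass `open('result.csv', 'w', newline='')` as `out`, and pass the page returned by `urllib.request.urlopen(quote_page)` as `html`.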
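Answer 1 (score: 0)
Your mistake is that you keep only the last value from each loop, so you do not get all of the values.
Another approach:
1) Append the values from each loop to a list
2) Write the values from the lists to the CSV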
import csv
import urllib.request
from bs4 import BeautifulSoup
quote_page = "https://www.cardekho.com/Hyundai/Gurgaon/cardealers"
page = urllib.request.urlopen(quote_page)
# CREATE NEW LISTS
dname_list = list()
dadres_list = list()
dmobile_list = list()
demail_list = list()
# parse the html using beautiful soup and store in variable `soup`
soup = BeautifulSoup(page, "lxml")
# APPEND TO THE LIST
# Take out the <div> of name and get its value
delrname = soup.find_all('div', class_='deleadres')
for name in delrname:
    dname = name.find('div', class_="delrname").text # name
    print(dname)
    dname_list.append(dname)
for address in delrname:
    dadres = address.find('p').text
    print(dadres)
    dadres_list.append(dadres)
for mobile in delrname:
    dmobile = mobile.find('div', class_="clearfix").text
    print(dmobile)
    dmobile_list.append(dmobile)
for email in delrname:
    demail = email.find('div', class_="mobno").text
    print(demail)
    demail_list.append(demail)
# exporting data into csv file....
with open('result.csv',newline='') as f:
    r = csv.reader(f)
    data = [line for line in r]
with open('result.csv','w',newline='') as f:
    w = csv.writer(f)
    w.writerow(['NAME','ADDRES','MOBILE','EMAIL'])
    # TRAVERSE THROUGH THE LIST
    # iterate over the number of collected names, not the characters of the last name
    for i in range(len(dname_list)):
        try:
            w.writerow([dname_list[i],dadres_list[i],dmobile_list[i],demail_list[i]])
        except IndexError:
            print('')
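The index loop with `try`/`except` could also be written with `zip`, which pairs up the four lists and stops at the shortest one. A small sketch, with made-up list contents standing in for the scraped values (the helper name `write_rows` is hypothetical):

```python
import csv
import io


def write_rows(out, dname_list, dadres_list, dmobile_list, demail_list):
    """Write a header plus one row per dealer to the file object out."""
    w = csv.writer(out)
    w.writerow(['NAME', 'ADDRES', 'MOBILE', 'EMAIL'])
    # zip yields one tuple per index and stops at the shortest list,
    # so no IndexError handling is needed.
    w.writerows(zip(dname_list, dadres_list, dmobile_list, demail_list))


# Made-up values standing in for the scraped data.
names = ['Dealer A', 'Dealer B']
addresses = ['Address A', 'Address B']
mobiles = ['111', '222']
emails = ['a@example.com']  # deliberately one entry short

buffer = io.StringIO()
write_rows(buffer, names, addresses, mobiles, emails)
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

Because `emails` is one entry short here, only the first dealer is written. That matches the original loop, which skips any row that raises `IndexError`.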
PS: Haken's answer is the better approach. I just wanted to show you another way.