I am very new to Python. I am trying to learn as much as I can while working on projects to keep my interest up.
In the code below, I am trying to scrape information from a website and write all the company names, addresses, etc. into an Excel file. I think I need to define how the Excel rows and columns are assigned for each iteration (each company), but I am drawing a blank on how exactly to go about it.
import requests, os
from bs4 import BeautifulSoup
from openpyxl import Workbook
from openpyxl import load_workbook
url = "https://dir.indiamart.com/search.mp?ss=Power+Distribution+Transformers"
r = requests.get(url)
soup = BeautifulSoup(r.content)
links = soup.find_all("a")
for link in links:
    print("<a href='%s'>%s</a>" % (link.get("href"), link.text))
g_data = soup.find_all("div", {"class": "nes"})
c = []
d = []
for item in g_data:
    c.append(item.contents[3].text)
    d.append(item.contents[1].text)
wb = load_workbook("Trial.xlsx")
ws1 = wb.get_sheet_by_name("Sheet1")
for i in c:
    ws1["A2"] = i
    wb.save("Trial.xlsx")
for x in d:
    ws1["B2"] = x
    wb.save("Trial.xlsx")
Answer 0 (score: 1)
import requests, bs4, re, csv

url = 'https://dir.indiamart.com/search.mp?ss=Power+Distribution+Transformers'
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'lxml')
# The code treats every <div class="lst"> as one company entry.
blocks = soup.find_all('div', class_='lst')
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for b in blocks:
        name = b.find(class_='cnm').get_text(strip=True)
        addr = b.find(class_='clg').get_text(strip=True)
        # Raw string so that \d is not read as an invalid escape sequence.
        call = b.find(class_='ls_co phn').find(text=re.compile(r'\d+')).strip()
        # One CSV row per company: name, address, phone number.
        writer.writerow([name, addr, call])
Output:
"Padmavahini Transformers Private Limited, Coimbatore","Saravanampatti, CoimbatoreS. F. No. 353/1, Door No. 7/140, Ruby Matriculation School Road Keeranatham, Saravanampatti,Coimbatore-641035,Tamil Nadu",8071681548
Guru Teg Bahadur Metal Works,"Shimlapuri, LudhianaNo. 1621, Street No. 4, Kwality Road, Near Kwality Chowk Shimlapuri,Ludhiana-141003,Punjab",8079452881
Servokon Systems Ltd.,"Servokon House, New DelhiServokon House, C-13, Radhu Palace Road Opposite Scope Minar,New Delhi-110092,Delhi",8048077499
Muskaan Power Infrastructure Ltd,"Dhandari Kalan, LudhianaSua Road, Industrial Area - C, Dhandari Kalan,Ludhiana-141014,Punjab",8079465606
Tamilnadu Electricals,"Ambattur Industrial Estate, ChennaiNo. 95 - H, (SP) Ambattur Industrial Estate,Chennai-600058,Tamil Nadu",8046073728
L. D. Power Transformers Pvt. Ltd.,"Sector 3, NoidaA-9, Sector- 59, Phase- 3,Noida-201301,Uttar Pradesh",8048111124
Western Electricals (pvt.) Ltd.,"Kaman, PalgharS. No. 6, H. No. 1, ( Part), Behind Shanti Metal, Near Sai Service, Vasai - Kaman Road Sativali Village, Taluka Vasai (E),Palghar-401208,Maharashtra",8071683491
You can store the data in a CSV file and then open it in Excel. The csv module is easy to use.
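If you do need an .xlsx file rather than a CSV, the likely reason the code in the question ends up with only one company is that every iteration writes to the same cells, "A2" and "B2", so each value overwrites the previous one. A minimal sketch of one way around that with openpyxl is to give each company its own row number. The sample lists, the header labels and the file name Trial_sketch.xlsx are placeholders I made up for illustration; in the real script the lists would be the scraped `c` and `d`.

```python
from openpyxl import Workbook

# Hypothetical sample data standing in for the scraped lists `c` and `d`.
names = ["Company A", "Company B", "Company C"]
addresses = ["Address A", "Address B", "Address C"]

wb = Workbook()
ws = wb.active
ws.title = "Sheet1"

# Header row (labels are an assumption, not from the original post).
ws["A1"] = "Name"
ws["B1"] = "Address"

# Start at row 2 so the header is kept; each company gets its own row.
for row, (name, addr) in enumerate(zip(names, addresses), start=2):
    ws.cell(row=row, column=1, value=name)  # column A
    ws.cell(row=row, column=2, value=addr)  # column B

# Save once, after all rows have been written.
wb.save("Trial_sketch.xlsx")
```

The same loop should work on a workbook opened with load_workbook("Trial.xlsx"); only the row counter matters, since it is what moves each company to a new line.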