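I am scraping company data from the web. My code is too slow, so I am trying concurrent.futures. I can't figure out what I'm missing. This is the input text file, 3_1.txt: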
AQ VENTURA PVT. LTD.
AQLU LEARNING PVT LTD
Aqquarate Solutions
Aqua Centric Pvt Ltd
AQUA EASY INFO TECH
Aqua Filmtec Pvt ltd
aqua sms
Aqua Soft Water Systems
AQUA SPA
This is my code:
import requests
import pandas as pd
from bs4 import BeautifulSoup
import concurrent.futures

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0'}

with open('3_1.txt', 'r') as f_in:
    companies = [line.strip() for line in f_in if line.strip()]

all_data = []
threads = 2

for company in companies:
    print(company)

def data(company):
    soup = BeautifulSoup(
        requests.get('https://google.com/search', params={'q': company, 'hl': 'en'}, headers=headers).content,
        'html.parser')
    address = soup.select_one('.LrzXr')
    if address:
        address = address.text
    else:
        address = 'Not Found'
    phone = soup.select_one('.LrzXr.zdqRlf.kno-fv')
    if phone:
        phone = phone.text
    else:
        phone = 'Not Found'
    all_data.append({"Company": company, "Address": address, "Phone": phone})

with concurrent.futures.ThreadPoolExecutor(max_workers=threads) as executor:
    executor.map(data, company)

df = pd.DataFrame(all_data)
df.to_csv('Companydata.csv')
I get output, but it isn't relevant. Please help me. Thanks.
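
For reference, here is a minimal sketch of what may be going wrong. It assumes the `for` loop has already finished when the executor starts, so `company` holds only the last name in the file. `executor.map` iterates over its second argument, and iterating over a single string yields its characters, so each worker would be searching for one letter instead of one company. In the sketch, `lookup` is a hypothetical stand-in for `data` that makes no network request, and the short `companies` list is a sample taken from the file above. The sketch also collects the values returned by `map` instead of appending to a shared list, which keeps the rows in input order.

```python
import concurrent.futures


def lookup(company):
    # Hypothetical stand-in for the scraping function: no network request,
    # it only echoes its argument so the effect of map() is visible.
    return {"Company": company, "Address": "Not Found", "Phone": "Not Found"}


# Sample of the names in 3_1.txt.
companies = ["AQUA EASY INFO TECH", "aqua sms", "AQUA SPA"]

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # A single string is iterated character by character.
    per_char = list(executor.map(lookup, companies[-1]))
    # The list gives one call per company; results come back in input order.
    per_company = list(executor.map(lookup, companies))

print([row["Company"] for row in per_char])
print([row["Company"] for row in per_company])
```

If this is indeed the cause, passing `companies` to `executor.map` and building the DataFrame from the returned rows may be enough; the number of workers and any delay between requests would still need tuning for the real site.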