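I've written a script that scrapes the address and phone number of certain shops based on a name and a lid. It runs each search using the Name and Lid values stored in columns A and B of a csv file. After fetching the result for each search, I want the parser to put the address and phone number in columns C and D respectively, as shown in the second image. This is where I'm stuck: I don't know how to use the reading or writing methods to fill the third and fourth columns. This is what I'm trying at the moment: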
import csv
import requests
from lxml import html
Names, Lids = [], []
with open("mytu.csv", "r") as f:
    reader = csv.DictReader(f)
    for line in reader:
        Names.append(line["Name"])
        Lids.append(line["Lid"])
with open("mytu.csv", "r") as f:
    reader = csv.DictReader(f)
    for entry in reader:
        Page = "https://www.yellowpages.com/los-angeles-ca/mip/{}-{}".format(entry["Name"].replace(" ","-"), entry["Lid"])
        response = requests.get(Page)
        tree = html.fromstring(response.text)
        titles = tree.xpath('//article[contains(@class,"business-card")]')
        for title in titles:
            Address = title.xpath('.//p[@class="address"]/span/text()')[0]
            Contact = title.xpath('.//p[@class="phone"]/text()')[0]
            print(Address, Contact)
How my csv file looks now:
The output I'd like to get looks like this:
Answer 0 (score: 1)
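You could do it like this. Create a new output csv file whose header is based on the input csv, with two columns added. Each csv row you read is available as a dictionary, called entry in this case. You can add new values to that dictionary from what you gather on the web, then write each newly built row to the output file.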
import csv
import requests
from lxml import html
with open("mytu.csv", "r") as f, open('new_mytu.csv', 'w', newline='') as g:
    reader = csv.DictReader(f)
    # Output header = input header plus the two new columns.
    newfieldnames = reader.fieldnames + ['Address', 'Phone']
    writer = csv.DictWriter(g, fieldnames=newfieldnames)
    writer.writeheader()
    for entry in reader:
        Page = "https://www.yellowpages.com/los-angeles-ca/mip/{}-{}".format(entry["Name"].replace(" ","-"), entry["Lid"])
        response = requests.get(Page)
        tree = html.fromstring(response.text)
        titles = tree.xpath('//article[contains(@class,"business-card")]')
        #~ for title in titles:
        # Only the first matching card is used; this raises IndexError if none is found.
        title = titles[0]
        Address = title.xpath('.//p[@class="address"]/span/text()')[0]
        Contact = title.xpath('.//p[@class="phone"]/text()')[0]
        print(Address, Contact)
        # entry is already a dict keyed by column name, so add the new keys to it.
        new_row = entry
        new_row['Address'] = Address
        new_row['Phone'] = Contact
        writer.writerow(new_row)
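If you want to check the csv part without hitting the network, the same read, extend, write pattern can be pulled out into a small function. This is only a sketch under some assumptions: add_columns and fake_lookup are names I made up for illustration, the lookup callable stands in for the scraping step, and the sample rows are invented. It also leaves the two new cells empty when nothing is found rather than raising.

```python
import csv
import io


def add_columns(src, dst, lookup):
    """Copy rows from src to dst, appending Address and Phone columns.

    `lookup` is a hypothetical callable taking (name, lid) and returning
    an (address, phone) tuple, or None when nothing was found.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + ["Address", "Phone"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        found = lookup(row["Name"], row["Lid"])
        # Leave both cells empty instead of raising when the lookup fails.
        row["Address"], row["Phone"] = found if found else ("", "")
        writer.writerow(row)


def fake_lookup(name, lid):
    # Stand-in for the scraping step; the data is made up for illustration.
    known = {("Shop A", "1"): ("1 Main St", "555-0100")}
    return known.get((name, lid))


# In-memory files keep the example self-contained; real code would pass
# the objects returned by open(..., newline='') instead.
src = io.StringIO("Name,Lid\nShop A,1\nShop B,2\n")
dst = io.StringIO()
add_columns(src, dst, fake_lookup)
result = dst.getvalue()
```

With real files you would pass the two open file handles and a function that wraps the requests and xpath calls in place of fake_lookup.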