Scraping addresses and phone numbers from this website

Date: 2019-09-25 10:58:29

Tags: python pandas beautifulsoup

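How can I use the bs4 and pandas libraries to extract data from the tag and the contact-info class on this site and export it to a CSV file? I need help with scraping the data from the tag and the contact-info class.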


import pandas as pd
import bs4
import requests
import re
full_dict={'Title':[],'Description':[],'Address':[]}
res=requests.get("https://cupcakemaps.com/cupcakes/cupcakes-near-me/p:2")
# parse the HTML so the listings can be searched
soup=bs4.BeautifulSoup(res.text,'html.parser')
listings=soup.findAll(class_='media')
for listing in listings:
    listing_title=listing.find(True,{'title':True}).attrs['title']
    listing_Description=listing.find('p',{'class':'summary-desc'})
    # attempt to keep only the digits of the contact-info text
    listing_address=listing.find('p',{'class':'contact-info'}).text=re.compile(r'[0-9]{0,4}')

1 answer:

Answer 0 (score: 1)

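  • .strip() - a built-in Python string method that removes all leading and trailing whitespace from a string.
  • .to_csv() - writes the object to a comma-separated values (CSV) file.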
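As a small, self-contained illustration of the two calls listed above, the sketch below cleans a single string and writes a one-row DataFrame as CSV text. The sample string and the "Example Bakery" row are made up for the illustration, and the CSV is written to an in-memory buffer instead of a file so that nothing is created on disk.

```python
import io

import pandas as pd

# Made-up sample text, padded with the kind of whitespace scraped HTML often has
raw = "  \n (714) 715-6086 \t"

# .strip() removes leading and trailing whitespace only
cleaned = raw.strip()

# keep just the digit characters, as the answer does for the contact text
digits = "".join(filter(str.isdigit, cleaned))

# one hypothetical row, using the same column names as the answer
df = pd.DataFrame([{"Title": "Example Bakery", "Address": digits}])

# .to_csv() accepts a path or a file-like object; index=False omits the row index
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a file name such as "contact.csv" in place of the buffer would write the same rows to disk.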

For example, applied to the page from the question:

import pandas as pd
from bs4 import BeautifulSoup, Tag
import requests

res = requests.get("https://cupcakemaps.com/cupcakes/cupcakes-near-me/p:2")
# the 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(res.text, 'lxml')
listings = soup.find_all(class_='media')
data = []
for listing in listings:
    # first descendant element that carries a title attribute
    listing_title = listing.find(True, {'title': True}).attrs['title']
    listing_Description = listing.find('p', {'class': 'summary-desc'})

    # find() returns None when nothing matches, so only read .text from a real Tag
    if isinstance(listing_Description, Tag):
        listing_Description = listing_Description.text.strip()

    listing_address = listing.find('p', {'class': 'contact-info'})

    if isinstance(listing_address, Tag):
        number_text = listing_address.text.strip()
        # keep only the digit characters of the contact text
        listing_address = ''.join(filter(str.isdigit, number_text))

    full_dict = {'Title': listing_title, 'Description': listing_Description, 'Address': listing_address}
    data.append(full_dict)

df = pd.DataFrame(data)
# save the data to a CSV file
df.to_csv("contact.csv")
print(df)

Output:

                                               Title                                        Description     Address
0  Explore Category 'Anaheim CA Birthday Cupcakes...  Delectable Anaheim, CA - Delectable check out ...  7147156086
1  Explore Category 'Costa Mesa CA Birthday Cupca...  Lisa's Gourmet Snacks Costa Mesa CA  check out...  7144275814
2  Explore Category 'Shorewood IL Birthday Cupcak...  Acapulco Bakery Inc Shorewood, IL - Acapulco B...  8157291737
3  Explore Category 'San Francisco CA Birthday Cu...  Hilda's Mart & Bake Shop San Francisco CA  che...  4153333122
4  Explore Category 'Los Angeles CA Birthday Cupc...  Lenny's Deli Los Angeles, CA - Lenny's Deli ch...  3104755771
5  Explore Category 'San Francisco CA Birthday Cu...  Sweet Inspirations San Francisco CA  check out...        None
6  Explore Category 'Costa Mesa CA Birthday Cupca...  The Cupcake Costa Mesa CA  check out  The Cupc...  9496420571
7  Explore Category 'Los Angeles CA Birthday Cupc...  United Bread & Pastry Inc Los Angeles CA  chec...  3236610037
8  Explore Category 'Garden Grove CA Birthday Cup...  Pescadores Garden Grove CA  check out  Pescado...  7145395585
9  Explore Category 'Bakersfield CA Birthday Cupc...  Bimbo Bakeries Usa Bakersfield CA  check out  ...  6613219352
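The Address column above holds only the digits of the contact text, which appear to be phone numbers. If the contact-info text of a listing also contained a street address, joining every digit would merge the two. The sketch below is one possible way to separate them with a regular expression. It assumes US-style 10-digit phone numbers; the helper name split_contact and the sample address are hypothetical and are not taken from the site.

```python
import re

# US-style phone number: optional parentheses around the area code,
# with an optional space, dot, or hyphen between the groups (an assumption).
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")


def split_contact(text):
    """Return (address, phone) from a contact string; either part may be None."""
    # collapse runs of whitespace and newlines into single spaces
    text = " ".join(text.split())
    match = PHONE_RE.search(text)
    if match is None:
        # no phone number found: treat whatever text there is as the address
        return text or None, None
    phone = "".join(match.groups())
    # everything outside the matched phone number is kept as the address
    address = (text[:match.start()] + text[match.end():]).strip(" ,")
    return address or None, phone


# Made-up sample text; the real markup of the page may differ
sample = "  123 Example St, Anaheim, CA 92801\n   (714) 715-6086 "
print(split_contact(sample))
```

In the answer's loop, calling such a helper on listing_address.text would give two values that could be stored in separate 'Address' and 'Phone' columns.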