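I am trying to scrape this site and want to extract the contact number that sits behind the call button.
How can I implement this in code?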
Answer 0 (score: 1)
It looks like the button triggers a simple AJAX request that returns an HTML string containing the phone number:
import re

import scrapy


class MySpider(scrapy.Spider):
    name = 'sophone'
    start_urls = [
        'http://www.freeindex.co.uk/profile(the-main-event-management-company)_266537.htm'
    ]

    def parse(self, response):
        # the item id can be extracted from the url
        item_id = re.findall(r"(\d+)\.htm", response.url)[0]
        # the phone api url can be built from this id
        url = 'http://www.freeindex.co.uk/customscripts' \
              '/popup_view_tel_details.asp?id={}'.format(item_id)
        yield scrapy.Request(url, self.parse_phone)

    def parse_phone(self, response):
        # drop into an interactive shell to inspect the returned html
        from scrapy.shell import inspect_response
        inspect_response(response, self)
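
If you want to check the URL-building step without running the spider, here is a minimal standalone sketch using only the standard library. The helper names `extract_item_id`, `build_phone_url` and `find_phone_candidates` are made up for illustration. The phone regex is only a guess at what a number might look like in the returned HTML, since the actual response format is not shown here, so inspect the real response first and adjust it.

```python
import re

# Endpoint taken from the spider above
PHONE_ENDPOINT = ('http://www.freeindex.co.uk/customscripts'
                  '/popup_view_tel_details.asp?id={}')


def extract_item_id(profile_url):
    """Return the numeric id that precedes '.htm', or None if absent."""
    match = re.search(r"(\d+)\.htm", profile_url)
    return match.group(1) if match else None


def build_phone_url(profile_url):
    """Build the phone-details URL for a profile page, or None."""
    item_id = extract_item_id(profile_url)
    return PHONE_ENDPOINT.format(item_id) if item_id else None


def find_phone_candidates(html):
    """Return digit runs that look like phone numbers.

    ASSUMPTION: the number appears as 10 or more digits, optionally
    separated by spaces or hyphens. This is not confirmed by the site.
    """
    return re.findall(r"\+?\d[\d\s\-]{8,}\d", html)


if __name__ == '__main__':
    profile = ('http://www.freeindex.co.uk/'
               'profile(the-main-event-management-company)_266537.htm')
    print(build_phone_url(profile))
```

Unlike `re.findall(...)[0]` in the spider, `extract_item_id` returns `None` instead of raising `IndexError` when a URL has no id, which makes it easier to skip unexpected pages. Inside `parse_phone` you could then try `find_phone_candidates(response.text)` once you have confirmed what the response really contains.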