Question

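I want to scrape data from a website, but I've run into a small problem that I don't know how to solve. (This is my first scraper, using BeautifulSoup and requests.) I need the phone number on the right, "07xx xxx xxx".

When I first open the page and request it, I get this:

The problem is that I need the phone number, but it isn't shown until I press "Arata telefon". Is there any way to extract this information?

This is the page itself: Link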

Answer 1

You just need to take the ID from the listing URL (6rqd4) and append it to http://olx.ro/ajax/misc/contact/phone:

In [22]: import requests

In [23]: requests.get("http://olx.ro/ajax/misc/contact/phone/6rqd4").json()
Out[23]: {'value': '0787 636 258'}

So if you have a lot of URLs, you can extract the ID with a regular expression:

In [30]: import requests

In [31]: from bs4 import BeautifulSoup

In [32]: import re

In [33]: patt = re.compile(r"ID(\w+)\.html")

In [34]: url = "http://olx.ro/oferta/chirie-zona-camine-hasdeu-fac-medicina-apartament-2-camere-78-mp-ID6rQD4.html#"

In [35]: requests.get("http://olx.ro/ajax/misc/contact/phone/{}".format(patt.search(url).group(1))).json()
Out[35]: {'value': '0787 636 258'}
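
If you want to run this over many listings, the two steps above could be wrapped in small helpers. This is only a sketch: the names extract_ad_id, phone_url and fetch_phone are made up for illustration, the endpoint is the one shown above (it may have changed since), and the timeout and error handling are my own additions.

```python
import re

# Endpoint copied from the answer above; whether it still responds is an assumption.
PHONE_ENDPOINT = "http://olx.ro/ajax/misc/contact/phone/{}"

# Same pattern as above: the ad ID sits between "ID" and ".html".
ID_PATTERN = re.compile(r"ID(\w+)\.html")


def extract_ad_id(url):
    """Return the ad ID embedded in a listing URL, or None if there is none."""
    match = ID_PATTERN.search(url)
    return match.group(1) if match else None


def phone_url(listing_url):
    """Build the AJAX URL for a listing, or return None if no ID was found."""
    ad_id = extract_ad_id(listing_url)
    return PHONE_ENDPOINT.format(ad_id) if ad_id else None


def fetch_phone(listing_url, timeout=10):
    """Hypothetical helper: request the phone number for one listing."""
    import requests  # imported here so the helpers above work without it

    target = phone_url(listing_url)
    if target is None:
        return None
    response = requests.get(target, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    # The answer shows a JSON body of the form {'value': '07xx xxx xxx'}.
    return response.json().get("value")
```

You would then call fetch_phone(url) for each listing URL and skip any that return None, i.e. URLs that do not contain an ID.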
