Question

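I am trying to scrape some data about house listings from the website http://immobilienscout.de. So far I have managed to scrape everything I need except one thing: the listing agent's phone number.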

The problem is that I cannot work out the path that leads to the text.

For example, this is the part of the page I want to extract from.

HTML code:

<div class="is24-phone palm-hide" data-is24-phone-number-block="" data-ng-show="!showPhoneNumbers" data-position="top">
            <div class="is24-show-phone-button print-hide hide">
              <span class="fa fa-phone font-lightgray"></span>
              <a href="javascript:void(0);" class="internal-link"><font><font>Show phone number</font></font></a>
            </div>
            <div class="is24-phone-number">
              <p>
                  <span><font><font>Mobil:</font></font></span><font><font> 0162 2056442</font></font></p>
              <p>
                  <span><font><font>Phone:</font></font></span><font><font> 030 72021143</font></font></p>
              </div>
          </div>

My code looks like this:

# Python 2 (urllib2 and the print statement)
import urllib2
from bs4 import BeautifulSoup

link = "https://www.immobilienscout24.de/expose/96068611"
html = urllib2.urlopen(link)
soup = BeautifulSoup(html, "html.parser")

findMobile = soup.find('div', attrs={'class': 'is24-phone-number'})
print findMobile.text.strip()

There is no output. What I need the output to be is: 0162 2056442.

Any help?

Answer 1

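If you open the page in Chrome, you should be able to right-click the content you want to scrape and click "Inspect element". Then, in the DOM view that opens, right-click the element again and choose Copy > Copy selector. This should give you a CSS selector that looks like this: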

#sidebar > div.module.community-bulletin > div > div:nth-child(10) > div.bulletin-item-content > a

You should then be able to select that element simply by running:

soup.select("#sidebar > div.module.community-bulletin > div > div:nth-child(10) > div.bulletin-item-content > a")
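Applied to the markup in the question, the same idea might look like the sketch below. It parses a static copy of the HTML snippet (with the nested <font> wrappers trimmed for readability, which is an assumption about what the real page serves) and uses a selector built from the class name shown there. The helper name extract_phone_numbers is hypothetical, and the sketch does not fetch the live page.

```python
from bs4 import BeautifulSoup

# Static copy of the markup from the question; the <font> wrappers are
# trimmed here (assumption: they are not needed to locate the numbers).
HTML = """
<div class="is24-phone palm-hide">
  <div class="is24-phone-number">
    <p><span>Mobil:</span> 0162 2056442</p>
    <p><span>Phone:</span> 030 72021143</p>
  </div>
</div>
"""


def extract_phone_numbers(html):
    """Map each label (e.g. 'Mobil') to its number using a CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    numbers = {}
    for p in soup.select("div.is24-phone-number > p"):
        label_tag = p.find("span")
        if label_tag is None:
            continue  # paragraph without a label, skip it
        label_text = label_tag.get_text(strip=True)    # e.g. "Mobil:"
        full_text = p.get_text(" ", strip=True)        # e.g. "Mobil: 0162 2056442"
        # Whatever follows the label inside the paragraph is the number.
        numbers[label_text.rstrip(":")] = full_text[len(label_text):].strip()
    return numbers


phones = extract_phone_numbers(HTML)
print(phones.get("Mobil"))  # 0162 2056442
```

This only works if the phone block is present in the HTML that is actually downloaded, so it is worth checking the raw response before relying on the selector.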

Edit: here is the documentation for .select(): https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors

Here is an example:

>>> from bs4 import BeautifulSoup
>>> import requests
>>> r = requests.get("https://stackoverflow.com/questions/45224417/how-to-scrape-hidden-phone-number-from-website-using-beautiful-soup-4/45224481#45224481")
>>> soup = BeautifulSoup(r.text, 'html.parser')
>>> soup.select("#comment-77415832 > td.comment-text > div > span.comment-copy")
[<span class="comment-copy">I tried to use your code for the element I am interested but the output is an empty list. Any ideas how to solve this?</span>]
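Note that .select() returns a list of tags, not text, and that the list is empty when nothing in the downloaded HTML matches. An empty result can mean the selector copied from the browser does not match what requests received, for instance because the browser's DOM was modified by JavaScript after loading. Whether that applies to a given page is something to verify. A minimal sketch of pulling the text out safely, using a made-up sample document and a hypothetical helper named first_text:

```python
from bs4 import BeautifulSoup


def first_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    tag = soup.select_one(selector)
    return tag.get_text(strip=True) if tag is not None else None


# Made-up sample markup, used only to illustrate both outcomes.
SAMPLE = '<div id="comments"><span class="comment-copy"> Example comment </span></div>'
soup = BeautifulSoup(SAMPLE, "html.parser")

found = first_text(soup, "#comments > span.comment-copy")
missing = first_text(soup, "#sidebar > a")

print(found)    # Example comment
print(missing)  # None
```

Returning None instead of indexing into the list avoids an IndexError when the selector does not match.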
