Question

I want to scrape the number of likes from a Facebook page. Using BeautifulSoup, this is what I have so far:

import requests
from bs4 import BeautifulSoup

user = 'LazadaMalaysia'

url = 'https://www.facebook.com/'+ user
response = requests.get(url)
soup = BeautifulSoup(response.content,'lxml')
f = soup.find('div', attrs={'class': '_4bl9'})

The output I get for f is as follows:

<div class="_4bl9 _3bcp"><div aria-keyshortcuts="Alt+/" aria-label="Pembantu Navigasi" class="_6a _608n" id="u_0_8" role="menubar"><div class="_6a uiPopover" id="u_0_9"><a aria-expanded="false" aria-haspopup="true" class="_42ft _4jy0 _55pi _2agf _4o_4 _63xb _p _4jy3 _517h _51sy" href="#" id="u_0_a" rel="toggle" role="button" style="max-width:200px;"><span class="_55pe">Bahagian-bahagian pada halaman ini</span><span class="_4o_3 _3-99"><i class="img sp_m7lN5cdLBIi sx_d3bfaf"></i></span></a></div><div class="_6a _3bcs"></div><div class="_6a mrm uiPopover" id="u_0_b"><a aria-expanded="false" aria-haspopup="true" class="_42ft _4jy0 _55pi _2agf _4o_4 _3_s2 _63xb _p _4jy3 _4jy1 selected _51sy" href="#" id="u_0_c" rel="toggle" role="button" style="max-width:200px;" tabindex="-1"><span class="_55pe">Bantuan Kebolehcapaian</span><span class="_4o_3 _3-99"><i class="img sp_m7lN5cdLBIi sx_0a4c0e"></i></span></a></div></div></div>

I used the code from this link: How do I scrape the about section of a Facebook page?

Unfortunately, it doesn't work, and I can't understand why. This is the part I want to scrape:

Answer 1

The likes are inside the div with class "_4-u3 _5sqi _5sqk". Here is the code to extract them.

import requests
from bs4 import BeautifulSoup
user = 'LazadaMalaysia'
url = 'https://www.facebook.com/'+ user
response = requests.get(url)
soup = BeautifulSoup(response.content,'lxml')
f = soup.find('div', attrs={'class': '_4-u3 _5sqi _5sqk'})
likes = f.find('span', attrs={'class': '_52id _50f5 _50f7'})  # find the span tag inside that div
print(likes.text)

I hope this solves your problem.
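The difference between the lookup in the question and the one in this answer can be illustrated offline. The sketch below is not the real Facebook page: the HTML string is a hypothetical stand-in that only reuses the class names quoted above, and the like count in it is made up. It shows two BeautifulSoup behaviours: searching for a single class such as `_4bl9` returns the first element that carries that class (which may be an unrelated navigation block), while a string with several classes matches the exact value of the class attribute. The helper name `extract_likes` is an assumption for illustration, and it returns None instead of raising when the markup is not what was expected.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup standing in for the fetched page.
# Only the class names come from the post; the text is invented.
html = '''
<div class="_4bl9 _3bcp">navigation helper</div>
<div class="_4-u3 _5sqi _5sqk">
  <span class="_52id _50f5 _50f7">1,234 people like this</span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# A single class name matches any element that has it among its classes,
# and find() returns only the first such element.
first = soup.find('div', attrs={'class': '_4bl9'})
print(first['class'])


def extract_likes(markup):
    """Return the likes text, or None if the expected tags are missing."""
    page = BeautifulSoup(markup, 'html.parser')
    # A string with several classes is compared against the exact
    # value of the class attribute.
    container = page.find('div', attrs={'class': '_4-u3 _5sqi _5sqk'})
    if container is None:
        return None
    span = container.find('span', attrs={'class': '_52id _50f5 _50f7'})
    if span is None:
        return None
    return span.get_text(strip=True)


print(extract_likes(html))
print(extract_likes('<div class="other"></div>'))
```

Because the class names are generated by the site and the page served to an anonymous request may differ from what a browser shows, checking for None before calling `.find` or `.text` makes a failed lookup easier to diagnose.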

Answer 2

I found the following easy to implement. It should fetch the number of likes and follows.

import re
import requests
from bs4 import BeautifulSoup

def get_info(user,url):
    response = requests.get(f'{url}{user}')
    soup = BeautifulSoup(response.text,'lxml')
    like = soup.find("div",text=re.compile('people like this')).text
    follow = soup.find("div",text=re.compile('people follow this')).text
    print(f'likes: {like}\nfollows: {follow}\n')

if __name__ == '__main__':
    url = "https://www.facebook.com/"
    users = ['LazadaMalaysia','ronaldo','rihanna']
    for user in users:
        get_info(user, url)
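The code above prints the matched text as it appears on the page. If a number is needed, the count can be pulled out of that text with a regular expression. This is a minimal sketch under the assumption that the text looks like "31,245,678 people like this" (digits with comma separators, followed by the word "people"); the function name `parse_count` and the sample strings are hypothetical.

```python
import re


def parse_count(text):
    """Extract the leading count from text such as '1,234 people like this'.

    Returns an int, or None when no count followed by 'people' is found.
    """
    match = re.search(r'(\d[\d,]*)\s+people', text)
    if match is None:
        return None
    # Drop the thousands separators before converting to int.
    return int(match.group(1).replace(',', ''))


print(parse_count('31,245,678 people like this'))
print(parse_count('2,001 people follow this'))
print(parse_count('no count here'))
```

Returning None for unmatched text keeps the caller in control of what to do when the page layout or language differs from the assumed format.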
