I want to scrape the number of likes from a page. Using BeautifulSoup, this is what I have so far:
import requests
from bs4 import BeautifulSoup

user = 'LazadaMalaysia'
url = 'https://www.facebook.com/' + user
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
f = soup.find('div', attrs={'class': '_4bl9'})
The output I get for f is:
<div class="_4bl9 _3bcp"><div aria-keyshortcuts="Alt+/" aria-label="Pembantu Navigasi" class="_6a _608n" id="u_0_8" role="menubar"><div class="_6a uiPopover" id="u_0_9"><a aria-expanded="false" aria-haspopup="true" class="_42ft _4jy0 _55pi _2agf _4o_4 _63xb _p _4jy3 _517h _51sy" href="#" id="u_0_a" rel="toggle" role="button" style="max-width:200px;"><span class="_55pe">Bahagian-bahagian pada halaman ini</span><span class="_4o_3 _3-99"><i class="img sp_m7lN5cdLBIi sx_d3bfaf"></i></span></a></div><div class="_6a _3bcs"></div><div class="_6a mrm uiPopover" id="u_0_b"><a aria-expanded="false" aria-haspopup="true" class="_42ft _4jy0 _55pi _2agf _4o_4 _3_s2 _63xb _p _4jy3 _4jy1 selected _51sy" href="#" id="u_0_c" rel="toggle" role="button" style="max-width:200px;" tabindex="-1"><span class="_55pe">Bantuan Kebolehcapaian</span><span class="_4o_3 _3-99"><i class="img sp_m7lN5cdLBIi sx_0a4c0e"></i></span></a></div></div></div>
I used the code from this link: How do I scrape the about section of a Facebook page?
Unfortunately, it does not work, and I cannot understand why. This is the part I want to scrape:
Answer 0 (score: 7)
The like count sits in a span nested inside the element with class "_4-u3 _5sqi _5sqk". Here is the code to extract the likes.
import requests
from bs4 import BeautifulSoup
user = 'LazadaMalaysia'
url = 'https://www.facebook.com/'+ user
response = requests.get(url)
soup = BeautifulSoup(response.content,'lxml')
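f = soup.find('div', attrs={'class': '_4-u3 _5sqi _5sqk'})
# Look for the span holding the count inside that div; find() returns None when nothing matches
likes = f.find('span', attrs={'class': '_52id _50f5 _50f7'}) if f else None
print(likes.text if likes else 'likes element not found')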
I hope this solves your problem.
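The nested lookup described in this answer can be tried offline before pointing it at a live page. The following is a minimal sketch, not the answerer's code: `SAMPLE_HTML` is hypothetical markup invented for illustration (not real Facebook HTML), `extract_likes` is a made-up helper name, and the built-in `html.parser` is used instead of `lxml` so that no extra parser is required. It uses a CSS selector, which matches the listed classes regardless of their order in the attribute, and returns None instead of raising when the markup differs.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup standing in for the fetched page (assumption, not real Facebook HTML).
SAMPLE_HTML = """
<div class="_4-u3 _5sqi _5sqk">
  <span class="_52id _50f5 _50f7">1,234 likes</span>
</div>
"""


def extract_likes(html):
    """Return the text of the nested span, or None if the markup does not match."""
    soup = BeautifulSoup(html, "html.parser")
    # The selector requires all three classes on the div, in any order,
    # then looks for a descendant span carrying the _52id class.
    span = soup.select_one("div._4-u3._5sqi._5sqk span._52id")
    return span.get_text(strip=True) if span else None


if __name__ == "__main__":
    print(extract_likes(SAMPLE_HTML))
    print(extract_likes("<div class='other'></div>"))
```

If the live page is served without these class names (for example, because the markup has changed or the content is rendered by JavaScript), the helper returns None rather than failing with an AttributeError.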
Answer 1 (score: 0)
I found the following easy to implement. It should fetch the number of likes and the number of follows.
import re
import requests
from bs4 import BeautifulSoup

def get_info(user, url):
    response = requests.get(f'{url}{user}')
    soup = BeautifulSoup(response.text, 'lxml')
    # Find the divs whose text contains the like/follow sentences
    like = soup.find("div", string=re.compile('people like this')).text
    follow = soup.find("div", string=re.compile('people follow this')).text
    print(f'likes: {like}\nfollows: {follow}\n')

if __name__ == '__main__':
    url = "https://www.facebook.com/"
    users = ['LazadaMalaysia', 'ronaldo', 'rihanna']
    # A plain loop instead of a list comprehension used only for its side effects
    for user in users:
        get_info(user, url)
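The code in this answer prints the matched sentence as-is. If a number is needed instead, the sentence can be parsed with a second regular expression. This is a rough sketch under an assumption: that the matched text looks like "1,234,567 people like this", which is inferred from the patterns in the answer rather than confirmed against the live page. `COUNT_RE` and `parse_count` are hypothetical names introduced here.

```python
import re

# Assumed shape of the matched text, e.g. "1,234,567 people like this".
COUNT_RE = re.compile(r"(\d[\d,]*)\s+people\s+(like|follow)\s+this")


def parse_count(text):
    """Return (kind, count) for a matching sentence, or None if it does not match."""
    match = COUNT_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting to an integer.
    count = int(match.group(1).replace(",", ""))
    return match.group(2), count


if __name__ == "__main__":
    print(parse_count("1,234,567 people like this"))
    print(parse_count("52 people follow this"))
```

Returning None for text that does not match keeps the caller in control of how to handle pages whose wording or locale differs from the assumed format.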