Importing social media followers

Time: 2018-03-20 02:49:19

Tags: python beautifulsoup jupyter-notebook

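I have been running this code on Python 2.7 for a while to automatically collect the follower counts of certain accounts on Twitter and Instagram.

It gives me the current number of Twitter and Instagram followers for basketball players.

This week the code stopped working, and I can't find a fix. If anyone has a solution, I would really appreciate it!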

import requests
from json import loads

Username = "nolimittb"
r = requests.get('https://www.instagram.com/'+Username)
html = r.text.encode("utf-8")
text = html[html.index("window._sharedData = ")+21:]
text = (text[:text.index("};</script>")]+"}").replace('\\"', "")
dictionary= loads(text)
data = dictionary["entry_data"]["ProfilePage"][0]["user"]

print "Thomas Bryant Instagram:"
print str(data["followed_by"]["count"]) + ' Followers'

from bs4 import BeautifulSoup
import requests
username='nolimittb31'
url = 'https://www.twitter.com/'+username
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")

f = soup.find('li', class_="ProfileNav-item--followers")
title = f.find('a')['title']
print 'Thomas Bryant Twitter:'
print title

num_followers = int(title.split(' ')[0].replace(',',''))

1 answer:

Answer 0 (score: 0)

If you print dictionary["entry_data"]["ProfilePage"][0], you get this:

{
    'logging_page_id': 'profilePage_1706986508',
    'show_suggested_profiles': False,
    'graphql': {
        'user': {
            'biography': "Official IG of TB\nLos Angeles Lakers\nA Hoosier at heart!! \nI will never be the same.. In Jesus' name.. Amen",
            'blocked_by_viewer': False,
            'country_block': False,
            'external_url': 'https://www.bookcameo.com/thomasbryant',
            'external_url_linkshimmed': 'https://l.instagram.com/?u=https%3A%2F%2Fwww.bookcameo.com%2Fthomasbryant&e=ATPQyJlxPSdd8tAG-VVbepnurfbkUgrQTP8OsVKTH6d3cSec9dSlcW7JCevxuXWdZb0UtCzc',
            'edge_followed_by': {
                'count': 62109
            },
            ...
        ...
    ...
}

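Its structure differs slightly from the one you are trying to index. To read the follower count from the returned data, you need the value at ./graphql/user/edge_followed_by/count

You can do it like this: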

print dictionary["entry_data"]["ProfilePage"][0]["graphql"]["user"]["edge_followed_by"]["count"]
# Outputs: 62115
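
If you would rather not raise a KeyError the next time the layout shifts, the slash-separated path can also be walked key by key. This is only a minimal sketch: `dig` is a hypothetical helper name (not part of requests or json), the sample dictionary is a trimmed stand-in for the real response, and the code uses `print(...)` so that it also runs on Python 3.

```python
def dig(data, path):
    """Follow a '/'-separated path of keys through nested dicts.

    Returns None as soon as a key is missing. `dig` is a hypothetical
    helper for illustration, not a library function.
    """
    node = data
    # Ignore the leading "." and any empty segments in the path
    keys = [k for k in path.split("/") if k not in ("", ".")]
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Trimmed stand-in for dictionary["entry_data"]["ProfilePage"][0]
profile_page = {"graphql": {"user": {"edge_followed_by": {"count": 62109}}}}

followers = dig(profile_page, "./graphql/user/edge_followed_by/count")
print(followers)
```

A missing key gives None instead of an exception, so the caller can decide how to report a changed page structure.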

Or modify your code directly, so that it looks like this:

...

data = dictionary["entry_data"]["ProfilePage"][0]["graphql"]["user"]

print "Thomas Bryant Instagram:"
print str(data["edge_followed_by"]["count"]) + ' Followers'

Output:

Thomas Bryant Instagram:
62115 Followers
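
Since the structure has already changed once, it may be worth checking both layouts before giving up. The sketch below assumes only the two shapes seen in this thread: the older ["user"]["followed_by"] layout from the question and the newer ["graphql"]["user"]["edge_followed_by"] layout from the dump. `follower_count` is a hypothetical helper name, the sample dictionaries are made-up stand-ins rather than real responses, and the syntax is chosen to run on Python 3 as well.

```python
def follower_count(profile_page):
    """Return the follower count from either known layout, or None.

    `profile_page` stands for dictionary["entry_data"]["ProfilePage"][0].
    """
    # Newer layout: graphql -> user -> edge_followed_by -> count
    user = profile_page.get("graphql", {}).get("user")
    if user is not None and "edge_followed_by" in user:
        return user["edge_followed_by"]["count"]

    # Older layout: user -> followed_by -> count
    user = profile_page.get("user")
    if user is not None and "followed_by" in user:
        return user["followed_by"]["count"]

    # Neither layout matched, so the page has probably changed again
    return None


# Made-up stand-ins for the two layouts (counts are illustrative only)
new_layout = {"graphql": {"user": {"edge_followed_by": {"count": 62115}}}}
old_layout = {"user": {"followed_by": {"count": 100}}}

print(str(follower_count(new_layout)) + " Followers")
print(str(follower_count(old_layout)) + " Followers")
```

When the function returns None, printing the top-level keys of `profile_page` is a quick way to see what the new structure looks like.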