How to get the actual Instagram follower count instead of the 'k' abbreviation using Python

Time: 2017-10-10 20:37:14

Tags: python python-3.x web-scraping instagram instagram-api

I have run into a problem: when I scrape the number of Instagram followers, I get the 'k' abbreviation instead of the actual number.

import requests
from bs4 import BeautifulSoup

def insta_info(account_name):
    # Fetch the public profile page
    html = requests.get('https://www.instagram.com/%s/' % account_name)
    soup = BeautifulSoup(html.text, 'lxml')
    # The og:description meta tag holds a summary string; split it into tokens
    data = soup.find_all('meta', attrs={'property': 'og:description'})
    text = data[0].get('content').split()
    user = '%s %s %s' % (text[-3], text[-2], text[-1])
    followers = text[0]  # first token, e.g. '14.2k'
    following = text[2]  # third token, e.g. '608'
    return [followers, following]

kellz = insta_info('kellz_ocho')  # the account name must be passed as a string
print(kellz)

This returns:

['14.2k', '608']

whereas I would like it to return:

[14241, 608]

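Is there a way to make this happen? I did not write the code above; I found it online and adapted it, so I do not know exactly how it works. Note that the account I want to scrape is public.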

Thanks a lot!
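
For readers who, like the asker, are unsure what the function does: it splits the og:description summary on whitespace and picks tokens by position. A minimal sketch, assuming the summary looks like the `sample` string below (its exact wording is an assumption, not copied from Instagram's page):

```python
# Hypothetical og:description content; the real wording may differ.
sample = ("14.2k Followers, 608 Following, 59 Posts - "
          "See Instagram photos and videos from Kellz (@kellz_ocho)")

tokens = sample.split()
followers = tokens[0]  # '14.2k': the value is already abbreviated in the source text
following = tokens[2]  # '608'
print([followers, following])
```

Under that assumption, the abbreviation is present in the text being parsed, so the parsing code cannot recover the rounded-off digits by itself.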

3 answers:

Answer 0: (score: 0)

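The code you provided is definitely not the right way to do this. Please do not use it.

As you can see from this link, https://www.instagram.com/developer/endpoints/users/, getting user information is very simple. If you do not want to write the authentication code yourself, you can even get an access token from here: http://instagram.pixelunion.net/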
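
A minimal sketch of reading exact counts from a users-endpoint style response. The payload is a hand-written sample, and its shape (a `counts` object with `followed_by` and `follows`) is an assumption about what the endpoint returns; no request is made, and `follower_counts` is a hypothetical helper name.

```python
import json

# Hand-written sample payload; the response shape is an assumption.
sample_response = """
{"data": {"username": "kellz_ocho",
          "counts": {"media": 59, "follows": 608, "followed_by": 14241}}}
"""

def follower_counts(payload):
    """Return [followers, following] from a users-endpoint style JSON string."""
    counts = json.loads(payload)["data"]["counts"]
    return [counts["followed_by"], counts["follows"]]

print(follower_counts(sample_response))
```

If the response has this shape, the counts arrive as integers, so no 'k' handling is needed.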

Answer 1: (score: 0)

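To get what you want, you need to use selenium together with BeautifulSoup, because the meta tags in the page source do not contain the exact number; the only thing available there is what you already have. Try this: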

from bs4 import BeautifulSoup
from selenium import webdriver

# Load the profile in a real browser and parse the rendered page
driver = webdriver.Chrome()
driver.get("https://www.instagram.com/kellz_ocho/")
soup = BeautifulSoup(driver.page_source, "html.parser")
driver.quit()

for title in soup.select("._h9luf"):
    posts = title.select("._fd86t")[0].text
    # The title attribute carries the unabbreviated follower count
    follower = title.select("._fd86t")[1]['title']
    following = title.select("._fd86t")[2].text
    print("Posts: {}\nFollower: {}\nFollowing: {}".format(posts, follower, following))

Result:

Posts: 59
Follower: 14,253
Following: 608

By the way, the follower count has changed in the meantime.
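
The follower value here is still a string with a thousands separator, while the question asks for a number. A small sketch of the conversion, assuming the separator is always a comma as in the result above (`count_from_title` is a hypothetical helper name):

```python
def count_from_title(title):
    """Convert a string such as '14,253' to an int by dropping the commas."""
    return int(title.replace(",", ""))

print(count_from_title("14,253"))
```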

Answer 2: (score: 0)

This should work. Basically, the additional code checks for a 'k' and, if there is one, multiplies the rest by 1000.

import requests
from bs4 import BeautifulSoup

def insta_info(account_name):
    html = requests.get('https://www.instagram.com/%s/' % account_name)
    soup = BeautifulSoup(html.text, 'lxml')
    data = soup.find_all('meta', attrs={'property': 'og:description'})
    text = data[0].get('content').split()
    user = '%s %s %s' % (text[-3], text[-2], text[-1])
    followers = text[0]
    # Compare case-insensitively: the question shows a lowercase 'k' ('14.2k')
    if followers[-1].lower() == 'k':
        # Only the rounded value can be recovered: '14.2k' becomes 14200
        followers = round(float(followers[:-1]) * 1000)
    else:
        # Assumption: smaller counts may contain commas, so strip them first
        followers = int(followers.replace(',', ''))
    following = text[2]
    if following[-1].lower() == 'k':
        following = round(float(following[:-1]) * 1000)
    else:
        following = int(following.replace(',', ''))
    return [followers, following]

kellz = insta_info('kellz_ocho')  # the account name must be passed as a string
print(kellz)
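
The suffix check can also be factored into one helper so it is not repeated for each field. This is a sketch under stated assumptions: `parse_count` and `SUFFIXES` are hypothetical names, and the 'm' suffix is a guess for larger accounts, since only 'k' appears in the question. `Decimal` is used so that scaling does not go through binary floating point.

```python
from decimal import Decimal

# Assumed suffixes: only 'k' is confirmed by the question; 'm' is a guess.
SUFFIXES = {"k": 1000, "m": 1000000}

def parse_count(token):
    """Expand a count such as '14.2k' or '608' into an int."""
    token = token.strip().lower().replace(",", "")
    multiplier = 1
    if token and token[-1] in SUFFIXES:
        multiplier = SUFFIXES[token[-1]]
        token = token[:-1]
    return int(Decimal(token) * multiplier)

print([parse_count("14.2k"), parse_count("608")])
```

Note that '14.2k' expands to 14200, not 14241: the abbreviated string does not contain the dropped digits.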