Scraping the retweet count with BeautifulSoup

Time: 2017-02-16 11:47:47

Tags: python beautifulsoup

Hello, I am using BeautifulSoup to scrape Twitter data, and I want to get the retweet count for each tweet. My code is below:

import urllib2
from bs4 import BeautifulSoup

url = "https://twitter.com/nokia"
response = urllib2.urlopen(url)
soup = BeautifulSoup(response,"html.parser")
tweets = soup.findAll('li',{"class":'js-stream-item'})
for tweet in tweets:
    if tweet.find('p',{"class":'tweet-text'}):
        tweet_user = tweet.find('span',{"class":'username'}).text
        tweet_text = tweet.find('p',{"class":'tweet-text'}).text.encode('utf8')
        retweets = tweet.find('span',{"class":"ProfileTweet-actionCount"}).text
        print(tweet_user)
        print(tweet_text)
        print(retweets)
    else:
        continue

I am able to get tweet_user and tweet_text, but for some reason I cannot get the retweet count. Can someone explain how to get it?

1 Answer:

Answer 0: (score: 1)

Although using tweepy is encouraged, here is your code with only a few modifications:

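import requests
from bs4 import BeautifulSoup

url = "https://twitter.com/nokia"
response = requests.get(url)
soup = BeautifulSoup(response.text, "lxml")

# Each tweet in the timeline is an <li class="js-stream-item"> element.
tweets = soup.find_all('li', {"class": 'js-stream-item'})
for tweet in tweets:
    # Skip stream items that carry no tweet text.
    if tweet.find('p', {"class": 'tweet-text'}):
        tweet_user = tweet.find('span', {"class": 'username'}).text.strip()
        # Keep the text as str; encoding to UTF-8 would print a bytes literal on Python 3.
        tweet_text = tweet.find('p', {"class": 'tweet-text'}).text.strip()
        # The first ProfileTweet-actionCount span holds the reply count, not the retweet count.
        replies = tweet.find('span', {"class": "ProfileTweet-actionCount"}).text.strip()
        # The retweet count sits inside the ProfileTweet-action--retweet span.
        retweets = tweet.find('span', {"class": "ProfileTweet-action--retweet"}).text.strip()
        print(tweet_user)
        print(tweet_text)
        print(replies)
        print(retweets)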
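
The same lookup can be tried offline against a small HTML sample, which makes it easier to see why the first ProfileTweet-actionCount span is not the retweet count. This is only a sketch: the sample markup, the data-tweet-stat-count attribute, the ProfileTweet-action--reply class, and the helper names action_count and parse_tweets are assumptions made for illustration, modelled on the class names in the code above. The live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the class names used above; real pages may differ.
SAMPLE_HTML = """
<ol>
  <li class="js-stream-item">
    <span class="username">@nokia</span>
    <p class="tweet-text">Hello world</p>
    <span class="ProfileTweet-action--reply">
      <span class="ProfileTweet-actionCount" data-tweet-stat-count="3">3 replies</span>
    </span>
    <span class="ProfileTweet-action--retweet">
      <span class="ProfileTweet-actionCount" data-tweet-stat-count="12">12 retweets</span>
    </span>
  </li>
  <li class="js-stream-item">
    <span class="username">@nokia</span>
  </li>
</ol>
"""


def action_count(tweet, action):
    """Return the count for one action ('reply' or 'retweet'), or None if absent."""
    # Narrow the search to the container for this action first, so the
    # counter of a different action cannot be picked up by mistake.
    container = tweet.find("span", class_="ProfileTweet-action--" + action)
    if container is None:
        return None
    counter = container.find("span", class_="ProfileTweet-actionCount")
    if counter is None or not counter.has_attr("data-tweet-stat-count"):
        return None
    return int(counter["data-tweet-stat-count"])


def parse_tweets(html):
    """Collect user, text, reply count and retweet count for each tweet."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for tweet in soup.find_all("li", class_="js-stream-item"):
        text = tweet.find("p", class_="tweet-text")
        if text is None:
            # Stream items without tweet text are skipped.
            continue
        results.append({
            "user": tweet.find("span", class_="username").get_text(strip=True),
            "text": text.get_text(strip=True),
            "replies": action_count(tweet, "reply"),
            "retweets": action_count(tweet, "retweet"),
        })
    return results


parsed = parse_tweets(SAMPLE_HTML)
print(parsed)
```

Searching inside the action-specific container first, and only then for the counter, avoids depending on the order in which the counters appear in the page.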