Question

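I want to scrape tweets from some Twitter posts, and I am using the BeautifulSoup library for that. What I want is to get the original post and all of its replies, if there are any (but all of them). I managed to get the original post, and I wrote this loop to get all the replies, but it only returns the first one. Please help, thanks!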

from bs4 import BeautifulSoup
import urllib.request

url= "https://twitter.com/20Minutes/status/692778440211169280"

list_Original_message =[]

readfile=urllib.request.urlopen(url).read()
soup = BeautifulSoup(readfile)

# ... the first part of my script scrapes the original post; I omit it because it works!

# loop to get the replies :

replies = soup.find_all('ol',{"class":'stream-items js-navigable-stream'})
for m in replies :
    name = m.findAll('strong',class_="fullname js-action-profile-name show-popup-with-id")[0]
    print(name.string)
    profile = m.findAll('span',class_="username js-action-profile-name")[0]
    print(profile.get_text())
    link = m.findAll('a',class_="tweet-timestamp js-permalink js-nav js-tooltip")[0]['href']
    print('https://twitter.com'+link)
    time = m.findAll('a',class_="tweet-timestamp js-permalink js-nav js-tooltip")[0]['title']
    print(time)
    message = m.findAll('p',class_="TweetTextSize js-tweet-text tweet-text")[0]
    print (message.get_text())

This is the result I get, only the first reply:

Mais l'eau dit

@Queen_MeloMau

https://twitter.com/Queen_MeloMau/status/692797851139710978

11:54 AM - 28 Jan 2016

@20Minutes dite moi que c'est une blagounette la @slavicdelrey

Answer 1

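In fact, only the first (few) tweets are sent in response to your original request; the rest are loaded asynchronously. Use the Twitter API; it exists for a reason.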
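Separately from the asynchronous loading, note that the loop in the question iterates over the `ol` container (there is only one) and takes `[0]` from every `findAll`, so it can never print more than the first reply per container. Below is a minimal sketch of iterating over the individual items instead. It only recovers the replies that are present in the initial HTML, not the ones loaded later. The sample markup is a hypothetical, simplified stand-in for what Twitter served; only the class names come from the question, and treating each reply as a direct `li` child of the `ol` is an assumption. `extract_replies` is an illustrative name.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page markup.
# Only the class names are taken from the question.
SAMPLE_HTML = """
<ol class="stream-items js-navigable-stream">
  <li>
    <strong class="fullname js-action-profile-name show-popup-with-id">Alice</strong>
    <span class="username js-action-profile-name">@alice</span>
    <a class="tweet-timestamp js-permalink js-nav js-tooltip"
       href="/alice/status/1" title="11:54 AM - 28 Jan 2016"></a>
    <p class="TweetTextSize js-tweet-text tweet-text">First reply</p>
  </li>
  <li>
    <strong class="fullname js-action-profile-name show-popup-with-id">Bob</strong>
    <span class="username js-action-profile-name">@bob</span>
    <a class="tweet-timestamp js-permalink js-nav js-tooltip"
       href="/bob/status/2" title="11:58 AM - 28 Jan 2016"></a>
    <p class="TweetTextSize js-tweet-text tweet-text">Second reply</p>
  </li>
</ol>
"""


def extract_replies(html):
    """Return one dict per reply found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    replies = []
    for container in soup.find_all("ol", class_="stream-items"):
        # Assumption: each reply is a direct <li> child of the container.
        for item in container.find_all("li", recursive=False):
            name = item.find("strong", class_="fullname")
            profile = item.find("span", class_="username")
            stamp = item.find("a", class_="tweet-timestamp")
            message = item.find("p", class_="tweet-text")
            # Skip items that do not look like a reply.
            if any(tag is None for tag in (name, profile, stamp, message)):
                continue
            replies.append({
                "name": name.get_text(strip=True),
                "profile": profile.get_text(strip=True),
                "link": "https://twitter.com" + stamp.get("href", ""),
                "time": stamp.get("title", ""),
                "message": message.get_text(strip=True),
            })
    return replies


if __name__ == "__main__":
    for reply in extract_replies(SAMPLE_HTML):
        print(reply)
```

Passing a single class name to `class_` matches any element that carries that class, so the lookups do not depend on the exact order of the full class string.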
