Question

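I'm trying to create a PRAW scraper that pulls the comments from a list of sub_ids. It only returns the comment data for the last sub_id.

I'm guessing I must be overwriting something. I've looked at other questions, but because I'm using PRAW, which has its own specific parameters, I can't figure out what can or should be replaced.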

sub_ids = ["2ypash", "7ocvlb", "7okxkf"]

for sub_id in sub_ids:

    submission = reddit.submission(id=sub_id)

    submission.comments.replace_more(limit=None, threshold=0)

comments = submission.comments.list()

commentlist = []
for comment in comments:

    commentsdata = {}
    commentsdata["id"] = comment.id
    commentsdata["subreddit"] = str(submission.subreddit)
    commentsdata["thread"] = str(submission.title)
    commentsdata["author"] = str(comment.author)
    commentsdata["body"] = str(comment.body)
    commentsdata["score"] = comment.score
    commentsdata["created_utc"] = datetime.datetime.fromtimestamp(comment.created_utc)
    commentsdata["parent_id"] = comment.parent_id

    commentlist.append(commentsdata)

Answer 1

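Indentation is your downfall. Your code fails because comments is only assigned once, after the loop over sub_ids has finished. So by the time you iterate over comments, it only holds the comments of the last sub_id.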
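The effect can be reproduced without PRAW. In this minimal sketch (the data and the names threads, thread, after_loop and collected are made up for illustration and are not from the question), the first loop only rebinds thread, so code placed after it sees just the last value, while work done inside the loop sees every value:

```python
# Hypothetical stand-in data: thread id -> its comments.
threads = {"a": ["a1", "a2"], "b": ["b1"], "c": ["c1", "c2"]}

# Buggy shape: the loop body only rebinds `thread` on each pass.
for thread_id in threads:
    thread = threads[thread_id]

# This line runs once, after the loop, so it sees only the last thread.
after_loop = list(thread)

# Fixed shape: the work happens inside the loop, once per thread.
collected = []
for thread_id in threads:
    thread = threads[thread_id]
    collected.extend(thread)

print(after_loop)  # ['c1', 'c2']
print(collected)   # ['a1', 'a2', 'b1', 'c1', 'c2']
```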

First, move commentlist = [] up so that it sits before the for sub_id in sub_ids loop (so that it comes right after line 1).

Second, everything from comments = submission.comments.list() onward (inclusive) needs to be indented so that it runs on every iteration of the for sub_id in sub_ids loop.

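This is what it should end up looking like:

sub_ids = ["2ypash", "7ocvlb", "7okxkf"]
commentlist = []

for sub_id in sub_ids:

    submission = reddit.submission(id=sub_id)

    submission.comments.replace_more(limit=None, threshold=0)

    comments = submission.comments.list()

    for comment in comments:

        commentsdata = {}
        commentsdata["id"] = comment.id
        commentsdata["subreddit"] = str(submission.subreddit)
        commentsdata["thread"] = str(submission.title)
        commentsdata["author"] = str(comment.author)
        commentsdata["body"] = str(comment.body)
        commentsdata["score"] = comment.score
        commentsdata["created_utc"] = datetime.datetime.fromtimestamp(comment.created_utc)
        commentsdata["parent_id"] = comment.parent_id

        commentlist.append(commentsdata)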
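To check the nested-loop structure without Reddit credentials, the same logic can be wrapped in a function and run against stand-in objects. This is only a sketch: collect_comments, FakeForest, FakeReddit and fake_submission are hypothetical names introduced here, not part of PRAW or of the original post. The fakes imitate only the three PRAW calls the question already uses (reddit.submission, comments.replace_more, comments.list).

```python
import datetime
from types import SimpleNamespace


def collect_comments(reddit, sub_ids):
    """Gather one dict per comment across every submission in sub_ids."""
    commentlist = []  # created once, before the outer loop
    for sub_id in sub_ids:
        submission = reddit.submission(id=sub_id)
        submission.comments.replace_more(limit=None, threshold=0)
        # The inner loop is nested, so it runs for every submission.
        for comment in submission.comments.list():
            commentlist.append({
                "id": comment.id,
                "subreddit": str(submission.subreddit),
                "thread": str(submission.title),
                "author": str(comment.author),
                "body": str(comment.body),
                "score": comment.score,
                "created_utc": datetime.datetime.fromtimestamp(comment.created_utc),
                "parent_id": comment.parent_id,
            })
    return commentlist


# --- Stand-ins for offline checking only; these are NOT PRAW classes ---
class FakeForest:
    def __init__(self, comments):
        self._comments = comments

    def replace_more(self, limit=None, threshold=0):
        return []  # nothing to expand in the fake

    def list(self):
        return list(self._comments)


class FakeReddit:
    def __init__(self, threads):
        self._threads = threads

    def submission(self, id):
        return self._threads[id]


def fake_submission(sub_id, n_comments):
    comments = [
        SimpleNamespace(id=f"{sub_id}_{i}", author="someone", body="text",
                        score=1, created_utc=1500000000,
                        parent_id=f"t3_{sub_id}")
        for i in range(n_comments)
    ]
    return SimpleNamespace(subreddit="test", title=f"thread {sub_id}",
                           comments=FakeForest(comments))


sub_ids = ["2ypash", "7ocvlb", "7okxkf"]
fake = FakeReddit({s: fake_submission(s, 2) for s in sub_ids})
rows = collect_comments(fake, sub_ids)
print(len(rows))  # 6: two fake comments from each of the three submissions
```

With a real client, the call would presumably be collect_comments(reddit, sub_ids), where reddit is the praw.Reddit instance the question already assumes exists.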

PRAW for loop only returns the last value
