PRAW: A progress bar for replace_more_comments()?

Date: 2015-11-22 14:23:57

Tags: python reddit praw

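I have been using the Python Reddit API Wrapper (PRAW) to collect specific comments from Reddit, and one function I use often is replace_more_comments(), which collects all of the comments in a thread.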

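Some of these threads are very large (10,000 comments, for example), and collecting all of their comments takes a while. Is there a way to display a progress bar for replace_more_comments()?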

Here is a minimal working example:

import praw

# The string passed to Reddit() is the user agent for this script
r = praw.Reddit('MSU vs Nebraska game')
submission = r.get_submission(submission_id='3rxx3y')
# Replace every MoreComments placeholder; this is the slow step on large threads
submission.replace_more_comments(limit=None, threshold=0)
all_comments = submission.comments
# Flatten the nested comment tree into a single list
flat_comments = praw.helpers.flatten_tree(submission.comments)

1 answer:

Answer 0 (score: 0)

The built-in implementation of replace_more_comments doesn't support this, but you can write your own version. For reference, here's the original implementation

I don't know how to draw the actual progress bar; you'll have to write update_progress_bar yourself. I haven't tested this code, and it may not work at all.

from heapq import heappop, heappush


def replace_more_comments(post):
    """Update the comment tree by replacing instances of MoreComments."""
    if post._replaced_more:
        return

    # Heap of MoreComments placeholders pulled out of the submission's tree
    more_comments = post._extract_more_comments(post.comments)

    # Estimate the total number of comments
    count = 0
    for item in more_comments:
        count += item.count

    update_progress_bar(0, count)

    num_loaded = 0

    while more_comments:
        item = heappop(more_comments)

        # Fetch the comments behind this placeholder; skip it if nothing came back
        new_comments = item.comments(update=False)
        if new_comments is None:
            continue

        # Re-add new MoreComment objects to the heap of more_comments
        for more in post._extract_more_comments(new_comments):
            more._update_submission(post)  # pylint: disable=W0212
            heappush(more_comments, more)
        # Increase progress bar
        num_loaded += len(new_comments)
        update_progress_bar(num_loaded, count)
        # Insert the new comments into the tree
        for comment in new_comments:
            post._insert_comment(comment)

    post._replaced_more = True
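
The function above calls update_progress_bar(num_loaded, count) but leaves it unwritten. One possible version is sketched below, using only the standard library and redrawing a single text line in the terminal. The helper render_progress_bar and the width and stream parameters are hypothetical names introduced for this sketch, not part of PRAW. Because count is only an estimate, the fraction is clamped so the bar never overflows.

```python
import sys


def render_progress_bar(num_loaded, total, width=40):
    """Return a one-line text bar such as '[#####-----] 50% (5/10)'."""
    # total is only an estimate, so clamp the fraction to the range [0, 1].
    if total > 0:
        fraction = min(max(float(num_loaded) / total, 0.0), 1.0)
    else:
        fraction = 1.0  # nothing to load counts as finished
    filled = int(width * fraction)
    bar = "#" * filled + "-" * (width - filled)
    return "[{}] {:.0f}% ({}/{})".format(bar, fraction * 100, num_loaded, total)


def update_progress_bar(num_loaded, total, stream=sys.stderr):
    """Redraw the bar in place; the carriage return moves back to line start."""
    stream.write("\r" + render_progress_bar(num_loaded, total))
    stream.flush()
```

Writing to stderr keeps the bar separate from anything the script prints to stdout; printing a final newline after the loop finishes stops the next output from overwriting the bar.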