Receiving HTTP 400 response when scraping Reddit with PRAW

Time: 2018-12-20 23:52:33

Tags: python http response http-status-code-400 praw

I am trying to scrape Reddit with PRAW, and it keeps throwing the error prawcore.exceptions.BadRequest: received 400 HTTP response.

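While experimenting in a Jupyter Notebook, I managed to build the entire working pipeline and had absolutely no problem retrieving data from Reddit. The problem appears only when I run the code as a script from the terminal. At first I thought it was caused by the different Python versions in the notebook (v3.6.5) and the virtual environment (v3.7.1). However, the error persists even after I switched the environment to 3.6.5.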
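Since the Python version alone did not explain the difference, the installed package versions may be worth comparing too. A minimal sketch of a check that could be run in both the notebook and the virtual environment is shown here. The helper name environment_report and the list of modules are assumptions made for illustration, not part of the original script.

```python
import importlib
import platform
import sys


def environment_report(modules=("praw", "prawcore", "requests")):
    """Return interpreter details and the versions of the given modules."""
    report = {
        "executable": sys.executable,
        "python": platform.python_version(),
    }
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # not installed in this environment
        else:
            # Most packages expose __version__; fall back if one does not.
            report[name] = getattr(module, "__version__", "unknown")
    return report


# Print one "key: value" line per entry so two runs can be compared by eye.
for key, value in environment_report().items():
    print(f"{key}: {value}")
```

If the two reports differ in the praw or prawcore version, that would be one place to look, although it may not be the cause.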

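There is no problem instantiating the Reddit object, Subreddit objects, Submission objects, and Comment objects when I test their output with nested for loops. My data pipeline is a set of functions that call one another in a similar nested pattern. Yet even though the way I call the functions is structurally similar to the nested loops, the script still fails at the generator.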
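For reference, the nested-loop test described above has roughly the following shape. This is a sketch, not the actual code from run_reddit_scraper.py: the function name walk_top_submissions and the limit value are hypothetical, and reddit is assumed to be an already configured praw.Reddit instance.

```python
def walk_top_submissions(reddit, subreddit_names, limit=5):
    """Yield (subreddit name, submission id, comment id) triples.

    `reddit` is assumed to be an authenticated praw.Reddit instance.
    """
    for name in subreddit_names:
        subreddit = reddit.subreddit(name)
        # top() returns a lazy listing generator: no HTTP request is sent
        # until iteration starts, which is where the traceback fails.
        for submission in subreddit.top(limit=limit):
            # Drop "load more comments" placeholders so that list()
            # returns only Comment objects.
            submission.comments.replace_more(limit=0)
            for comment in submission.comments.list():
                yield name, submission.id, comment.id
```

Because the listing is lazy, creating the Subreddit object and calling top() can succeed even when the request itself is later rejected, so the 400 only surfaces inside the for loop.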

This is the terminal output:

Traceback (most recent call last):
  File "run_reddit_scraper.py", line 295, in <module>
    reddit_id = process_reddit(reddit, SUBREDDIT_NAMES)
  File "run_reddit_scraper.py", line 200, in process_reddit
    subreddits_pk, subreddit_count = process_subreddits(reddit, subreddit_names)
  File "run_reddit_scraper.py", line 165, in process_subreddits
    submissions_pk, submission_count = process_submissions(subreddit)
  File "run_reddit_scraper.py", line 119, in process_submissions
    for submission in top_submissions:
  File "/Users/nicktheodore/reddit-scraper/env/lib/python3.6/site-packages/praw/models/listing/generator.py", line 52, in __next__
    self._next_batch()
  File "/Users/nicktheodore/reddit-scraper/env/lib/python3.6/site-packages/praw/models/listing/generator.py", line 62, in _next_batch
    self._listing = self._reddit.get(self.url, params=self.params)
  File "/Users/nicktheodore/reddit-scraper/env/lib/python3.6/site-packages/praw/reddit.py", line 391, in get
    data = self.request('GET', path, params=params)
  File "/Users/nicktheodore/reddit-scraper/env/lib/python3.6/site-packages/praw/reddit.py", line 506, in request
    params=params)
  File "/Users/nicktheodore/reddit-scraper/env/lib/python3.6/site-packages/prawcore/sessions.py", line 185, in request
    params=params, url=url)
  File "/Users/nicktheodore/reddit-scraper/env/lib/python3.6/site-packages/prawcore/sessions.py", line 130, in _request_with_retries
    raise self.STATUS_EXCEPTIONS[response.status_code](response)
prawcore.exceptions.BadRequest: received 400 HTTP response
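
The traceback reports only the status code, not the URL or parameters that Reddit rejected. A sketch of how the failing request could be inspected follows. It assumes that prawcore's ResponseException (the base class of BadRequest) exposes the underlying requests response as .response, and it enables debug logging for the praw and prawcore loggers as PRAW's logging documentation describes. The helper names enable_praw_debug_logging, describe_failure, and iterate_with_diagnostics are hypothetical.

```python
import logging

try:
    from prawcore.exceptions import ResponseException
except ImportError:
    # Fallback so the helpers still import where prawcore is missing.
    ResponseException = Exception


def enable_praw_debug_logging():
    """Send praw/prawcore debug messages to stderr."""
    handler = logging.StreamHandler()
    handler.setLevel(logging.DEBUG)
    for name in ("praw", "prawcore"):
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)


def describe_failure(exc):
    """Summarize the HTTP response attached to an exception, if any."""
    response = getattr(exc, "response", None)
    if response is None:
        return repr(exc)
    return f"{response.status_code} for {response.url}\n{response.text}"


def iterate_with_diagnostics(listing):
    """Yield items from a listing; print response details on an HTTP error."""
    try:
        for item in listing:
            yield item
    except ResponseException as exc:
        print(describe_failure(exc))
        raise  # keep the original traceback
```

Wrapping top_submissions in iterate_with_diagnostics inside process_submissions, and calling enable_praw_debug_logging() once at startup, should show which URL and parameters the script sends, which can then be compared with what the notebook sends.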

I have no idea what is going on at this point, so I am completely stuck. Any feedback is much appreciated!

0 answers:

No answers yet.