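I wrote a script that extracts data from an e-commerce website. I use requests to fetch the page and bs4 to parse its content. When I run the script locally on my computer, everything works fine. Listing the data takes 3-4 seconds, but that's acceptable. The problems start when I deploy the script to Heroku. After pushing it to Heroku the script still works, but it runs slowly, and the most annoying part is that it crashes often. It scrapes the data about 6-7 times and then throws a long chain of errors. As a beginner, I can't make sense of them. Here is the full traceback from the Heroku logs: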
2020-09-11T18:39:48.896959+00:00 app[worker.1]: Traceback (most recent call last):
2020-09-11T18:39:48.897027+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
2020-09-11T18:39:48.897328+00:00 app[worker.1]: conn = connection.create_connection(
2020-09-11T18:39:48.897333+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
2020-09-11T18:39:48.897547+00:00 app[worker.1]: raise err
2020-09-11T18:39:48.897569+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
2020-09-11T18:39:48.897793+00:00 app[worker.1]: sock.connect(sa)
2020-09-11T18:39:48.897834+00:00 app[worker.1]: OSError: [Errno 113] No route to host
2020-09-11T18:39:48.897835+00:00 app[worker.1]:
2020-09-11T18:39:48.897891+00:00 app[worker.1]: During handling of the above exception, another exception occurred:
2020-09-11T18:39:48.897892+00:00 app[worker.1]:
2020-09-11T18:39:48.897898+00:00 app[worker.1]: Traceback (most recent call last):
2020-09-11T18:39:48.897898+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
2020-09-11T18:39:48.898299+00:00 app[worker.1]: httplib_response = self._make_request(
2020-09-11T18:39:48.898322+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connectionpool.py", line 381, in _make_request
2020-09-11T18:39:48.898652+00:00 app[worker.1]: self._validate_conn(conn)
2020-09-11T18:39:48.898672+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connectionpool.py", line 978, in _validate_conn
2020-09-11T18:39:48.899235+00:00 app[worker.1]: conn.connect()
2020-09-11T18:39:48.899238+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connection.py", line 309, in connect
2020-09-11T18:39:48.899483+00:00 app[worker.1]: conn = self._new_conn()
2020-09-11T18:39:48.899488+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
2020-09-11T18:39:48.899630+00:00 app[worker.1]: raise NewConnectionError(
2020-09-11T18:39:48.899656+00:00 app[worker.1]: urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fd5906c0250>: Failed to establish a new connection: [Errno 113] No route to host
2020-09-11T18:39:48.899658+00:00 app[worker.1]:
2020-09-11T18:39:48.899658+00:00 app[worker.1]: During handling of the above exception, another exception occurred:
2020-09-11T18:39:48.899659+00:00 app[worker.1]:
2020-09-11T18:39:48.899661+00:00 app[worker.1]: Traceback (most recent call last):
2020-09-11T18:39:48.899678+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
2020-09-11T18:39:48.899896+00:00 app[worker.1]: resp = conn.urlopen(
2020-09-11T18:39:48.899899+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
2020-09-11T18:39:48.900165+00:00 app[worker.1]: retries = retries.increment(
2020-09-11T18:39:48.900180+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/urllib3/util/retry.py", line 439, in increment
2020-09-11T18:39:48.900369+00:00 app[worker.1]: raise MaxRetryError(_pool, url, error or ResponseError(cause))
2020-09-11T18:39:48.900409+00:00 app[worker.1]: urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.flipkart.com', port=443): Max retries exceeded with url: /search?q=shoes&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd5906c0250>: Failed to establish a new connection: [Errno 113] No route to host'))
2020-09-11T18:39:48.900411+00:00 app[worker.1]:
2020-09-11T18:39:48.900411+00:00 app[worker.1]: During handling of the above exception, another exception occurred:
2020-09-11T18:39:48.900412+00:00 app[worker.1]:
2020-09-11T18:39:48.900412+00:00 app[worker.1]: Traceback (most recent call last):
2020-09-11T18:39:48.900414+00:00 app[worker.1]: File "server.py", line 103, in <module>
2020-09-11T18:39:48.900542+00:00 app[worker.1]: reply= bot.flipkart(product= message_type)
2020-09-11T18:39:48.900567+00:00 app[worker.1]: File "/app/bot.py", line 86, in flipkart
2020-09-11T18:39:48.900823+00:00 app[worker.1]: datas= Test.scrape(product)
2020-09-11T18:39:48.900828+00:00 app[worker.1]: File "/app/Test.py", line 7, in __init__
2020-09-11T18:39:48.901017+00:00 app[worker.1]: self.source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(search_query)).content
2020-09-11T18:39:48.901049+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/requests/api.py", line 76, in get
2020-09-11T18:39:48.901257+00:00 app[worker.1]: return request('get', url, params=params, **kwargs)
2020-09-11T18:39:48.901262+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/requests/api.py", line 61, in request
2020-09-11T18:39:48.901466+00:00 app[worker.1]: return session.request(method=method, url=url, **kwargs)
2020-09-11T18:39:48.901471+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
2020-09-11T18:39:48.901887+00:00 app[worker.1]: resp = self.send(prep, **send_kwargs)
2020-09-11T18:39:48.901891+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
2020-09-11T18:39:48.902410+00:00 app[worker.1]: r = adapter.send(request, **kwargs)
2020-09-11T18:39:48.902413+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
2020-09-11T18:39:48.902823+00:00 app[worker.1]: raise ConnectionError(e, request=request)
2020-09-11T18:39:48.902882+00:00 app[worker.1]: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.flipkart.com', port=443): Max retries exceeded with url: /search?q=shoes&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd5906c0250>: Failed to establish a new connection: [Errno 113] No route to host'))
2020-09-11T18:39:48.991351+00:00 heroku[worker.1]: Process exited with status 1
2020-09-11T18:39:49.047690+00:00 heroku[worker.1]: State changed from up to crashed
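I'm sorry for not sharing the whole code. I would have, but it is spread across two or three files that depend on each other, so I can't post all of it here. I've tried hard but can't understand the error, so any help is much appreciated!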
Answer 0 (score: 1)
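The error you're showing is caused by a missing or slow internet connection. Check whether the connection is working properly; if it isn't, restart your current Python environment.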
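The traceback also shows why the dyno crashes rather than just failing one request: the `requests.exceptions.ConnectionError` raised by `requests.get` in `Test.py` is never caught, so it propagates up through `bot.py` to `server.py` and the worker exits with status 1. A minimal sketch of one way to tolerate such transient failures is below. The helper name `fetch_with_retries` and its parameters (`retries`, `backoff`, `timeout`, `get`, `sleep`) are hypothetical and not part of the original code; the retry counts and delays are assumptions to tune.

```python
import time

import requests


def fetch_with_retries(url, retries=3, backoff=2.0, timeout=10,
                       get=requests.get, sleep=time.sleep):
    """Return the response body as bytes, or None if every attempt fails.

    Hypothetical helper: `get` and `sleep` are injectable only so the
    function can be exercised without real network access.
    """
    for attempt in range(retries):
        try:
            # A timeout keeps a stalled connection from hanging the worker.
            response = get(url, timeout=timeout)
            # HTTP error statuses (4xx/5xx) raise HTTPError, which is
            # deliberately not retried here.
            response.raise_for_status()
            return response.content
        except (requests.exceptions.ConnectionError,
                requests.exceptions.Timeout):
            # Wait longer after each failure: backoff, 2*backoff, 4*backoff...
            if attempt < retries - 1:
                sleep(backoff * (2 ** attempt))
    # All attempts failed; the caller decides what to reply instead of crashing.
    return None
```

In `Test.py`, the direct `requests.get(...).content` call could then be replaced with `fetch_with_retries(url)`, with the caller checking for `None` and replying with an error message instead of letting the process exit. This does not address why Heroku intermittently cannot reach the host; it only keeps the worker alive when that happens.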