BeautifulSoup is slow in a Django view but fine locally

Time: 2018-04-18 12:25:49

Tags: python django performance beautifulsoup

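I am using BeautifulSoup in a Django view to scrape a web page, collecting some img src values to display on my own page. My problem: when I run the code in a Jupyter Notebook, the task takes less than a second, but when I run it inside a Django view it takes about 10 seconds (depending on the query).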

Here is my code:

from bs4 import BeautifulSoup
import requests
import re
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
import time
import json


def get_soup(url,header):
    return BeautifulSoup(urllib2.urlopen(urllib2.Request(url,headers=header)),'html.parser')

def get_images(query):
    start_time = time.time()
    url="https://www.google.co.in/search?q="+query+"&source=lnms&tbm=isch"
    header={'User-Agent':"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36"
    }
    soup = get_soup(url,header)
    print("--- %s seconds ---" % (time.time() - start_time))
    start_time = time.time()
    ActualImages=[]# contains the link for Large original images, type of  image
    for a in soup.find_all("div",{"class":"rg_meta"}):
        link =json.loads(a.text)["ou"]
        ActualImages.append(link)
    print("--- %s seconds ---" % (time.time() - start_time))
    dic = {'images': ActualImages[:10]}
    return dic

get_images('thunder')

The slow part is the call to get_soup(). Is this normal? Any ideas for making it faster?
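Since get_soup() downloads the page and parses it in a single expression, the timing above cannot say which of the two is slow. One way to narrow it down might be to time the two steps separately, in both Jupyter and the Django view. The helper below is only a sketch: the names timed and profile_fetch_and_parse are hypothetical and not part of the original code, and the fetch and parse callables are passed in as assumptions about how the steps would be split.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("--- %s: %.3f seconds ---" % (label, elapsed))
    return result, elapsed


def profile_fetch_and_parse(fetch, parse, url):
    """Time the download and the parse as two separate steps.

    fetch(url) is assumed to return the raw response body, and
    parse(body) is assumed to build the parsed tree from it.
    Both are hypothetical hooks supplied by the caller.
    """
    body, fetch_seconds = timed("fetch", fetch, url)
    tree, parse_seconds = timed("parse", parse, body)
    return tree, {"fetch": fetch_seconds, "parse": parse_seconds}
```

With the code from the question, fetch could be something like lambda u: urllib2.urlopen(urllib2.Request(u, headers=header)).read() and parse could be lambda b: BeautifulSoup(b, 'html.parser'). Comparing the two numbers in each environment would show whether the extra time is spent on the network request or in the parser.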

0 answers:

No answers yet.