Invalid URL '': No schema supplied. Perhaps you meant http://? - Django

Date: 2019-10-05 04:36:28

Tags: django python-3.x

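I am a beginner with Django. I am working on a data-scraping project and have written the code below, but I run into a problem when downloading the CSV file. I read a "download" field in the view, but I don't get the result I want. Instead, I get this error: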

Invalid URL '': No schema supplied. Perhaps you meant http://?

Here is my code:

views.py

def index(request):
    if request.method == "POST":

        url  = request.POST.get('url', '')

        down = request.POST.get('download','')

        r = requests.get(url)
        soup = BeautifulSoup(r.content, features="lxml")
        p_name = soup.find_all("h2",attrs={"class": "a-size-mini"})
        p_price = soup.find_all("span",attrs={"class": "a-price-whole"})
        p_image = soup.findAll('img', {'class':'s-image','src':re.compile('.jpg')})

        response = HttpResponse(content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="product_file.csv"'


        for name,price,image in zip(p_name,p_price,p_image):
            writer = csv.writer(response)
            row = writer.writerow([name.text, price.text,image['src']])

            name_data  = [data.text for data in p_name]
            price_data = [data.text for data in p_price]
            image_data = [data['src'] for data in p_image]
            dec = {'name':name_data, 'price':price_data, 'image':image_data}



        if down:
            return response



    else:
        dec = {}
    return render(request, 'index.html',dec)

When I remove the "if down:" check, the CSV file downloads correctly; when I keep the if condition, it raises this error:

Invalid URL '': No schema supplied. Perhaps you meant http://?

index.html

<div class="container">
<div class="row justify-content-md-center">
    <div class="col-md-4">
        <form method="POST" action="">{% csrf_token %}
            <h1 class="mb-3 display-4">Amazone Scraper</h1>
            <input type="text" id="url" name="url" class="form-control" placeholder="URL" required autofocus>
            <button class="mt-3 btn btn-lg btn-primary btn-block" type="submit" id="submit" name='submit'>Scrap</button>
        </form>
        <p class="mt-3"><a href="upload">Upload</a> Your File For Updates Regarding</p>
        <form action="" method="post">{% csrf_token %}<!--------download---------->
          <input class="mt-3 btn btn-info" type="submit" id="download" name='download' value='Download'/>
        </form>
    </div>
</div>
<div class="row">

1 Answer:

Answer 0 (score: 2)

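The problem is that you have two forms. When you click the Download button, the browser submits the second form, which does not contain the url field. As a result, url is an empty string in your view, and requests.get('') raises the error. You should refactor this view to use a single form.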
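To make this concrete, here is a minimal sketch of what each form submits. Plain dicts stand in for Django's request.POST, and the token value, the example URL and the has_scheme helper are assumptions made up for illustration; they are not part of the original code.

```python
from urllib.parse import urlsplit

# Hypothetical payloads: roughly what each form in index.html would submit.
# Plain dicts stand in for request.POST; the values are made up.
scrape_form_post = {
    "csrfmiddlewaretoken": "token",
    "url": "https://www.example.com/s?k=laptop",
    "submit": "",
}
download_form_post = {
    "csrfmiddlewaretoken": "token",
    "download": "Download",
}


def has_scheme(url):
    """Return True if the URL has both a scheme and a host."""
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


for label, post in [("scrape", scrape_form_post), ("download", download_form_post)]:
    url = post.get("url", "")  # same lookup as in the view
    print(f"{label}: url={url!r} fetchable={has_scheme(url)}")
```

The download form yields an empty url, which is exactly the value that appears in the error message. A check along the lines of has_scheme before calling requests.get would let the view show a friendly message instead of crashing.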

Alternatively, you can add a url field to the second form and use the url from the first form as its default value:

<div class="container">
<div class="row justify-content-md-center">
    <div class="col-md-4">
        <form method="POST" action="">{% csrf_token %}
            <h1 class="mb-3 display-4">Amazone Scraper</h1>
            <input type="text" id="url" name="url" class="form-control" placeholder="URL" required autofocus>
            <button class="mt-3 btn btn-lg btn-primary btn-block" type="submit" id="submit" name='submit'>Scrap</button>
        </form>
        <p class="mt-3"><a href="upload">Upload</a> Your File For Updates Regarding</p>
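        <form action="" method="post">{% csrf_token %}<!--------download---------->
          <input type="hidden" name="url" value="{{ url }}">
          <input class="mt-3 btn btn-info" type="submit" id="download" name='download' value='Download'/>
        </form>
    </div>
</div>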

You also need to add url to the template context:

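    # Create the CSV writer once, before the loop.
    writer = csv.writer(response)
    for name, price, image in zip(p_name, p_price, p_image):
        writer.writerow([name.text, price.text, image['src']])

    # Build the context outside the loop so `dec` exists even if nothing was scraped.
    # The only change required for the fix is the extra 'url' key.
    name_data  = [data.text for data in p_name]
    price_data = [data.text for data in p_price]
    image_data = [data['src'] for data in p_image]
    dec = {'name': name_data, 'price': price_data, 'image': image_data, 'url': url}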

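Note that with this approach the request to the third-party URL is sent twice: once when scraping and again when downloading. So I would suggest refactoring the view to use a single form that scrapes and downloads in one step.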
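One way the single-form refactor might look is sketched below, kept framework-free so it can be run on its own. The names handle_post, rows_to_csv and fake_scrape are hypothetical, the dict stands in for request.POST, and fake_scrape replaces the requests + BeautifulSoup code with made-up data.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialise (name, price, image) rows to CSV text with a single writer."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


def handle_post(post, scrape):
    """Decide what a single-form view would return.

    `post` is a dict standing in for request.POST; `scrape` is a hypothetical
    callable that takes a URL and returns a list of (name, price, image) rows.
    Returns ("error", message), ("csv", text) or ("html", context).
    """
    url = post.get("url", "").strip()
    if not url:
        # Never call the scraper with an empty URL.
        return "error", "Please enter a URL."

    rows = scrape(url)  # the third-party site is requested only once
    if "download" in post:
        return "csv", rows_to_csv(rows)
    return "html", {"rows": rows, "url": url}


def fake_scrape(url):
    """Stand-in for the real scraping code; returns made-up rows."""
    return [("Laptop A", "499", "https://example.com/a.jpg")]


print(handle_post({"url": "https://example.com", "download": "Download"}, fake_scrape))
print(handle_post({"download": "Download"}, fake_scrape))
```

In the template, this corresponds to one form containing the url input and two submit buttons named submit and download; the browser sends only the name of the button that was clicked. In the real view, the "csv" branch would return the HttpResponse with content_type='text/csv' from the question, and the "html" branch would call render.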