Finding the size of a file on a website using Python 3

Time: 2014-09-13 09:59:48

Tags: python file find size using

Here is my code. When I use this version, I get the correct output:

url = "http://songs.djmazadownload.com/music/indian_movies/Creature%20%282014%29/01%20-%20Creature%20-%20Sawan%20Aaya%20Hai%20%5BDJMaza.Info%5D.mp3"

site = urllib.request.urlopen(url)
print ( (  round((site.length / 1024),2) / 1024) , "Mb")

But when I use this version:

url = "http://songs.djmazadownload.com/music/indian_movies/Creature (2014)/01 - Creature - Sawan Aaya Hai [DJMaza.Info].mp3"

site = urllib.request.urlopen(url)
print ( (  round((site.length / 1024),2) / 1024) , "Mb")

I get an error:

Traceback (most recent call last):
  File "C:\Python_Mass_downloader\New folder\download.py", line 35, in <module>
    site = urllib.request.urlopen(url)
  File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 461, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 571, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 499, in error
    return self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 579, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

My full code is:

def audios(link):
    print("Enter the Minimum size of file to be downloaded\n")
    size1 = input('>\t')

    url = urllib.request.urlopen(link)
    content = url.read()
    soup = BeautifulSoup(content)
    links = [a['href'] for a in soup.find_all('a',href=re.compile('http.*\.(mp3|wav|ogg|wma|flac)'))]
    print (str(len(links)) + " Audios Found ")
    print("\n".join(links))




    # For downloading
    dest = "C:\\Downloads\\" # or '~/Downloads/' on linux
    for i in range(len(links)):
        a_link = links[i]
        #a_link2 = urllib.request.urlopen(a_link)
        site = urllib.request.urlopen(a_link)
        size2 = int((site.length / 1024) / 1024) 

        if size 1 >= size2:


            obj = SmartDL(a_link, dest)
            obj.start()

            path = obj.get_dest()

**So I need to find the size of a file before downloading it. But the links I get from the page are in this format: http://songs.djmazadownload.com/music/indian_movies/Creature (2014)/01 - Creature - Sawan Aaya Hai [DJMaza.Info].mp3

How can I get the links in the other format? http://songs.djmazadownload.com/music/indian_movies/Creature%20%282014%29/01%20-%20Creature%20-%20Sawan%20Aaya%20Hai%20%5BDJMaza.Info%5D.mp3

**

1 answer:

Answer 0 (score: 1)

Use:

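import urllib.request
from urllib.parse import quote

....

# quote_plus() on a whole URL would also escape "://" and turn spaces into "+",
# so use quote() and keep ":" and "/" unescaped.
url = quote(url, safe=":/")

site = urllib.request.urlopen(url)

# site.length is the Content-Length in bytes (None if the server does not send one)
print(round(site.length / 1024 / 1024, 2), "MB")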

...
Print the URL before opening it to see what you actually got.
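
To make the encoding step concrete, here is a minimal sketch rather than a definitive implementation. It percent-encodes only the path of the URL, since `quote_plus()` applied to a whole URL would also escape the `://` after the scheme. The helper names `encode_url` and `remote_size_bytes` are made up for this illustration. The sketch assumes that any `%` already in the link is part of an existing escape, and that the server answers `HEAD` requests with a `Content-Length` header, which not every server does.

```python
import urllib.request
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path of a URL and leave the other parts alone.

    Hypothetical helper. "%" is kept as a safe character so that a link
    which is already encoded is not encoded a second time.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def remote_size_bytes(url, timeout=10):
    """Return the size reported by the server in bytes, or None if unknown.

    Hypothetical helper. It sends a HEAD request so the file body is not
    downloaded just to read its size.
    """
    request = urllib.request.Request(encode_url(url), method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


raw = ("http://songs.djmazadownload.com/music/indian_movies/"
       "Creature (2014)/01 - Creature - Sawan Aaya Hai [DJMaza.Info].mp3")
print(encode_url(raw))
```

With these helpers, each link scraped from the page could be passed to `remote_size_bytes(a_link)` inside the download loop. A `None` result means the server did not report a size.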

There is also a syntax error here:

if size 1 >= size2:
#should be:
if size1 >= size2:
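
Even with the space removed, the comparison can still fail in Python 3: `input()` returns a string, and comparing a `str` with an `int` raises `TypeError`. The following is a small sketch of one way to handle it, using the hypothetical helpers `parse_min_size_mb` and `is_large_enough`. It assumes the prompt's "minimum size" means that only files at least that large should be downloaded, which is the opposite direction from the comparison in the question, so adjust it if the intent is different.

```python
def parse_min_size_mb(text):
    """Convert the text typed at the prompt into a number of megabytes."""
    return float(text.strip())


def is_large_enough(size_bytes, min_size_mb):
    """Return True if a file of size_bytes is at least min_size_mb megabytes.

    An unknown size (None) is treated as not large enough; that is a choice
    made for this sketch, not something the original code specifies.
    """
    if size_bytes is None:
        return False
    return size_bytes / (1024 * 1024) >= min_size_mb


min_mb = parse_min_size_mb("3")
print(is_large_enough(5 * 1024 * 1024, min_mb))  # 5 MB file, 3 MB minimum
print(is_large_enough(1024, min_mb))             # 1 KB file, 3 MB minimum
```

Keeping the size in bytes until the comparison avoids the truncation that `int((site.length / 1024) / 1024)` introduces for files smaller than 1 MB.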