Downloading a file over HTTP with urllib.urlretrieve doesn't work properly

Time: 2013-12-15 19:29:39

Tags: python download

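I'm still working on my mp3 downloader, but now I'm having trouble with the files being downloaded. I have two versions of the part that is giving me trouble. The first gives me a correct file but raises an error. The second gives me a file that is far too small but raises no error. I've tried opening the file in binary mode, but that didn't help. I'm fairly new to working with HTML, so any help would be appreciated.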

import urllib
import urllib2

def milk():
    SongList = []
    SongStrings = []
    SongNames = []
    earmilk = urllib.urlopen("http://www.earmilk.com/category/pop")
    reader = earmilk.read()
    #gets the position of the playlist
    PlaylistPos = reader.find("var newPlaylistTracks = ")
    #finds the number of songs in the playlist
    NumberSongs = reader[reader.find("var newPlaylistIds = " ): PlaylistPos].count(",") + 1
    initPos = PlaylistPos

    #goes through the playlist and records the html address and name of the song

    for song in range(0, NumberSongs):
        songPos = reader[initPos:].find("http:") + initPos
        namePos = reader[songPos:].find("name") + songPos
        namePos += reader[namePos:].find(">")
        nameEndPos = reader[namePos:].find("<") + namePos
        SongStrings.append(reader[songPos: reader[songPos:].find('"') + songPos])
        SongNames.append(reader[namePos + 1: nameEndPos])
        initPos = nameEndPos

    for correction in range(0, NumberSongs):
        SongStrings[correction] = SongStrings[correction].replace('\\/', "/")

    #downloading songs

    fileName = ''.join([a.isalnum() and a or '_' for a in SongNames[0]])
    fileName = fileName.replace("_", " ") + ".mp3"


#         This version writes a file that can be played but gives an error saying: "TypeError: expected a character buffer object"
##    songDL = open(fileName, "wb")
##    songDL.write(urllib.urlretrieve(SongStrings[0], fileName))


#         This version creates the file but it cannot be played (file size is much smaller than it should be)
##    url = urllib.urlretrieve(SongStrings[0], fileName)
##    url = str(url)
##    songDL = open(fileName, "wb")
##    songDL.write(url)


    songDL.close()

    earmilk.close()

1 Answer:

Answer 0 (score: 2)

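Re-read the documentation for urllib.urlretrieve:

Return a tuple (filename, headers) where filename is the local file name under which the object can be found, and headers is whatever the info() method of the object returned by urlopen() returned (for a remote object, possibly cached).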
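To make that return value concrete, here is a minimal sketch that is not part of the original answer. The helper name `inspect_urlretrieve` is hypothetical, and the import falls back to `urllib.request` on the assumption that the code may be run on Python 3 rather than the Python 2 used in the question. It suggests why both attempts in the question misbehave: `urlretrieve` has already written the complete file by the time it returns, so passing its tuple to `write()` raises the TypeError, while reopening the file in "wb" mode and writing `str(...)` of the tuple replaces the download with a short piece of text.

```python
try:
    from urllib import urlretrieve  # Python 2, as in the question
except ImportError:
    from urllib.request import urlretrieve  # Python 3 location (assumption)


def inspect_urlretrieve(url, file_name):
    """Download url to file_name and report what urlretrieve handed back.

    Hypothetical helper, for illustration only.
    """
    result = urlretrieve(url, file_name)  # the file is already on disk here
    local_name, headers = result          # the (filename, headers) tuple
    return {
        "is_tuple": isinstance(result, tuple),
        "local_name": local_name,
        # str(result) is only a short description of the tuple; writing it
        # into the file overwrites the real download with this text.
        "as_text": str(result),
    }
```

Calling `inspect_urlretrieve(SongStrings[0], fileName)` inside `milk()` should show a tuple whose first element is `fileName`, not the mp3 data.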

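You seem to be expecting it to return the bytes of the file itself. The point of urlretrieve is that it handles writing the file for you and returns the filename it was written to (which will generally be the same as the second argument to the function, if you provided one).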
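In practice this means the open/write/close lines can simply be dropped. The following is a sketch under that reading, not code from the original answer: the function name `download` is made up for illustration, and the Python 3 import fallback is an assumption added so the snippet runs on either version.

```python
try:
    from urllib import urlretrieve  # Python 2, as in the question
except ImportError:
    from urllib.request import urlretrieve  # Python 3 location (assumption)


def download(url, file_name):
    """Save the resource at url to file_name and return the path written.

    Hypothetical helper, for illustration only.
    """
    # urlretrieve opens the destination in binary mode, copies the response
    # into it, and closes it. There is nothing left to write afterwards.
    local_name, headers = urlretrieve(url, file_name)
    return local_name
```

In the question's `milk()` function, the download step would then reduce to `download(SongStrings[0], fileName)`, with the trailing `songDL.close()` removed because no file object is opened by hand.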