Python CSV not writing

Date: 2014-03-27 18:39:37

Tags: python csv

I have this code, which iterates through a txt file of URLs and searches each page for files to download:

import csv
import os
import urlparse
from re import compile
from urllib import urlopen, urlretrieve
from BeautifulSoup import BeautifulSoup as bs  # BeautifulSoup 3; with bs4 it's `from bs4 import BeautifulSoup`

URLS = open("urlfile.txt").readlines()

def downloader():
    with open('data.csv', 'w') as csvfile:
        writer = csv.writer(csvfile)
        for url in URLS:
            try:
                html_data = urlopen(url)
            except:
                print 'Error opening URL: ' + url
                continue  # skip this URL; otherwise html_data below is stale or undefined

            #Creates a BS object out of the open URL.
            soup = bs(html_data)
            #Parsing the URL for later use
            urlinfo = urlparse.urlparse(url)
            domain = urlparse.urlunparse((urlinfo.scheme, urlinfo.netloc, '', '', '', ''))
            path = urlinfo.path.rsplit('/', 1)[0]

            FILETYPE = [r'\.pdf$', r'\.ppt$', r'\.pptx$', r'\.doc$', r'\.docx$', r'\.xls$', r'\.xlsx$', r'\.wmv$', r'\.mp4$', r'\.mp3$']

            #Loop iterates through list of file types for open URL.
            for types in FILETYPE:
                for link in soup.findAll(href = compile(types)):
                    urlfile = link.get('href')
                    filename = urlfile.split('/')[-1]
                    while os.path.exists(filename):
                        try:
                            fileprefix = filename.split('_')[0]
                            filetype = filename.split('.')[-1]
                            num = int(filename.split('.')[0].split('_')[1])
                            filename = fileprefix + '_' + str(num + 1) + '.' + filetype
                        except:
                            filetype = filename.split('.')[1]
                            fileprefix = filename.split('.')[0] + '_' + str(1)
                            filename = fileprefix + '.' + filetype

                    #Creates a full URL if needed.
                    if '://' not in urlfile and not urlfile.startswith('//'):
                        if not urlfile.startswith('/'):
                            urlfile = urlparse.urljoin(path, urlfile)
                        urlfile = urlparse.urljoin(domain, urlfile)

                    #Downloads the urlfile or returns error for manual inspection
                    try:
                        urlretrieve(urlfile, filename, Percentage)  # Percentage: a progress reporthook defined elsewhere
                        writer.writerow(['SUCCESS', url, urlfile, filename])
                        print "     SUCCESS"
                    except:
                        print "     ERROR"
                        writer.writerow(['ERROR', url, urlfile, filename])

Everything works except that no data is written to the CSV. No directories are being changed (as far as I know, at least...).

The script iterates through the external list of URLs, finds the files, downloads them correctly, and prints "SUCCESS" or "ERROR" without any problem. The only thing it isn't doing is writing the data to the CSV file. It will run to completion without writing any CSV data.

I tried running it in a virtualenv to make sure there weren't any odd package issues.

Could my nested loops be preventing the CSV data from being written?

2 Answers:

Answer 0 (score: 2)

Try opening the file in binary mode: with open('data.csv', 'wb') as csvfile:

http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files

Alternatively, build an iterable of rows instead of calling writerow each time, and write them all at once with writerows. If you run the script in interactive mode, you can then inspect the contents of the iterable (i.e. [['SUCCESS', ...], ['SUCCESS', ...], ...]).

import csv
with open('some.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(someiterable)
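A concrete version of that sketch, with a hypothetical `results` list standing in for `someiterable` (rows collected during the download loop and written in one go at the end). Note that 'wb' is the correct mode for the csv module on Python 2, while Python 3 wants text mode with newline='':

```python
import csv
import sys

# Hypothetical rows collected during the download loop instead of
# being written one at a time.
results = [
    ['SUCCESS', 'http://example.com/page', 'http://example.com/a.pdf', 'a.pdf'],
    ['ERROR', 'http://example.com/page', 'http://example.com/b.pdf', 'b.pdf'],
]

# Open in the mode the running interpreter's csv module expects.
if sys.version_info[0] == 2:
    f = open('some.csv', 'wb')
else:
    f = open('some.csv', 'w', newline='')

with f:
    writer = csv.writer(f)
    writer.writerows(results)
```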

Answer 1 (score: 0)

So I let the script run to completion, and for some reason the data started writing to the CSV after it had been running for a while. I have no idea how to explain it. The data was being held in memory somehow and started writing at random? I don't know, but the data is accurate compared to the log printed in the terminal.

Weird.
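This behavior is consistent with ordinary file buffering: csv.writer hands each row to a buffered file object, and Python only pushes the buffer out to the OS when it fills up or when the with block closes the file, so rows can sit invisible in memory for a long time on a slow download loop. A minimal sketch (with made-up rows standing in for the real download results) showing how to force each row out immediately:

```python
import csv
import os

# Hypothetical rows standing in for the real per-download results.
rows = [
    ['SUCCESS', 'http://example.com/page', 'http://example.com/a.pdf', 'a.pdf'],
    ['ERROR', 'http://example.com/page', 'http://example.com/b.pdf', 'b.pdf'],
]

with open('data.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    for row in rows:
        writer.writerow(row)
        csvfile.flush()  # push Python's buffer to the OS right away
        # os.fsync(csvfile.fileno())  # optionally force the OS to write it to disk too
```

With the flush in place, the CSV grows row by row as the script runs instead of appearing all at once at the end.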