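In Python 2.7 - how to read data from a csv, reformat the data, and write it to a new csv

Time: 2014-04-28 21:27:47

Tags: python python-2.7

I'm a programming newbie and I'm stuck... My goal is to open a csv file (an export from program A), reformat the data, and write it to a new csv file (for program B to import). I know my code isn't pretty, but it works right up until it writes the new csv file. It only writes the last row of data from the old csv.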

import csv

print 'Enter file location or press enter for default location.'
aPath = raw_input('> ')  # user path to file
if aPath == '':
    aPath = '/default/path/to/file'  # default path
aFile = raw_input('Enter file name: ')  # user file name
if aFile == '':
    aFile = 'orginal.csv'  # default file name
myCSV = csv.DictReader(open(aPath + aFile, 'r'), delimiter=',')
print myCSV.fieldnames # verify fieldnames

for p in myCSV:
    w = csv.writer(open("output.csv", "w"))
    try:
        p = dict((k, v) for k, v in p.iteritems()
            if v.lower() != 'null')
    except AttributeError, e:
        print e
        print p
        raise Exception()
# reformats the columns of data for the output.csv
    pID = p.get('Last Name')[-4:]
    pName = p.get('Last Name')[:-4].strip() + ', ' + p.get('First Name')
    pDate = p.get('Start Time')[:-5]
    pBlank = p.get('')
    pCourse = p.get('Assigned Through')
    pScore = p.get('Score')[:-1]

# verifies the new columns
    print pID, pName, pDate, pBlank, pCourse, pBlank, pBlank, pScore

# NOT working...Only writes the last row of the orginal.csv to output.csv
    w.writerow([pID, pName, pDate, pBlank, pCourse, pBlank, pBlank, pScore])

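3 answers:

Answer 0 (score: 3)

You are reopening the file, and overwriting it, on every pass through the for p in myCSV loop. You need to create w once, before entering the for loop.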
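The difference can be seen in a small sketch. The rows and the file names (per_row.csv, once.csv) are made up for illustration and are written to a temporary directory; this is not the asker's data, only a demonstration of how mode "w" behaves when the file is opened inside versus outside the loop.

```python
import csv
import os
import tempfile

# Made-up rows and hypothetical file names, for illustration only.
rows = [["1001", "Smith, Ann"], ["1002", "Jones, Bob"], ["1003", "Lee, Cy"]]
tmp_dir = tempfile.mkdtemp()
per_row_path = os.path.join(tmp_dir, "per_row.csv")
once_path = os.path.join(tmp_dir, "once.csv")

# Problem: mode "w" truncates the file on every open, so each pass
# throws away what the previous pass wrote.
for row in rows:
    out = open(per_row_path, "w")
    csv.writer(out).writerow(row)
    out.close()

# Fix: open the file and create the writer once, before the loop.
out = open(once_path, "w")
w = csv.writer(out)
for row in rows:
    w.writerow(row)
out.close()


def count_rows(path):
    # Count non-empty lines so the result does not depend on line endings.
    with open(path) as f:
        return len([line for line in f.read().splitlines() if line.strip()])


per_row_count = count_rows(per_row_path)
once_count = count_rows(once_path)
print("%d %d" % (per_row_count, once_count))
```

The file that was reopened for every row ends up holding only the last row, which matches the symptom described in the question.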

Answer 1 (score: 1)

The following modified code should work:

import csv

print 'Enter file location or press enter for default location.'
aPath = raw_input('> ')  # user path to file
if aPath == '':
    aPath = '/default/path/to/file'  # default path
aFile = raw_input('Enter file name: ')  # user file name
if aFile == '':
    aFile = 'orginal.csv'  # default file name
myCSV = csv.DictReader(open(aPath + aFile, 'r'), delimiter=',')
print myCSV.fieldnames # verify fieldnames

with open("output.csv", "wb") as ofile:
    w = csv.writer(ofile)
    for p in myCSV:
        try:
            p = dict((k, v) for k, v in p.iteritems()
                if v.lower() != 'null')
        except AttributeError, e:
            print e
            print p
            raise Exception()
        # reformats the columns of data for the output.csv
        pID = p.get('Last Name')[-4:]
        pName = p.get('Last Name')[:-4].strip() + ', ' + p.get('First Name')
        pDate = p.get('Start Time')[:-5]
        pBlank = p.get('')
        pCourse = p.get('Assigned Through')
        pScore = p.get('Score')[:-1]

        # verifies the new columns
        print pID, pName, pDate, pBlank, pCourse, pBlank, pBlank, pScore

        # writes one row to output.csv for every row of orginal.csv
        w.writerow([pID, pName, pDate, pBlank, pCourse, pBlank, pBlank, pScore])

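What we changed:

  1. We used the with statement. Using with when opening and closing files is the generally accepted convention unless stated otherwise, because it takes care of closing the file properly.
  2. We used wb instead of w as the second argument to open. This opens the file in binary write mode, which is what the Python 2 csv module expects; in text mode, output on Windows can end up with an extra blank line after each row.

Hope this helps.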
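As a rough illustration of points 1 and 2, here is a sketch that combines with and the file mode. The helper open_csv_for_writing is a hypothetical name, not part of the csv module, and the rows are made up. The Python 3 branch is an addition that the answer does not cover: there the csv module expects a text-mode file opened with newline="" rather than "wb".

```python
import csv
import os
import sys
import tempfile


def open_csv_for_writing(path):
    """Open path in the mode the csv module expects (hypothetical helper)."""
    if sys.version_info[0] < 3:
        # Python 2: csv wants a binary-mode file, as in the answer above.
        return open(path, "wb")
    # Python 3: csv wants a text-mode file opened with newline="".
    return open(path, "w", newline="")


out_path = os.path.join(tempfile.mkdtemp(), "output.csv")

# The with block closes ofile when the block ends.
with open_csv_for_writing(out_path) as ofile:
    w = csv.writer(ofile)
    w.writerow(["1001", "Smith, Ann", "95"])  # made-up rows
    w.writerow(["1002", "Jones, Bob", "88"])

# Read the raw bytes back to inspect the line endings that were written.
with open(out_path, "rb") as f:
    raw = f.read()
print(repr(raw))
```

Each row should end in a single "\r\n" with no blank lines in between, and the name field is quoted because it contains a comma.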

Answer 2 (score: 0)

I'm afraid you are rewriting your output file on every iteration... You should simply move

w = csv.writer(open("output.csv", "w"))

out of the loop:

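ofile = open("output.csv", "w")  # keep the file object: csv.writer has no close()
w = csv.writer(ofile)
try:
    for p in myCSV:
        # ... (same per-row processing as in the question)
        w.writerow([pID, pName, pDate, pBlank, pCourse, pBlank, pBlank, pScore])
finally:
    ofile.close()  # runs even if a row raises an exception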

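You can also look at the with construct, which closes the file at the end of the block, even if an I/O error occurs, without needing an extra try.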
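A small sketch of that behavior, under made-up assumptions: the rows are invented, the output goes to a temporary directory, and the second row has no Score value so that the slice fails partway through the loop, standing in for any mid-loop error.

```python
import csv
import os
import tempfile

out_path = os.path.join(tempfile.mkdtemp(), "output.csv")  # hypothetical location
rows = [{"Score": "95%"}, {"Score": None}]  # made-up data; the second row is bad

try:
    with open(out_path, "w") as ofile:
        w = csv.writer(ofile)
        for p in rows:
            # Slicing None raises TypeError on the second row.
            w.writerow([p.get("Score")[:-1]])
except TypeError:
    failed = True
else:
    failed = False

# The with block closed the file on the way out, even though the loop failed.
closed_after_error = ofile.closed
print("%s %s" % (failed, closed_after_error))
```

No finally clause is needed here: leaving the with block, whether normally or through an exception, closes the file.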