Splitting fields on commas and quotes in Python?

Time: 2013-11-21 14:26:47

Tags: python csv quotes comma

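I'm trying to split this CSV file into a 2D list. The problem with my code right now is that it drops several fields from lines whose data contains quotation marks. The quotes are there to signify that the comma inside them is not a field separator but part of the field itself. I've posted the code, sample data, and sample output. You can see how the first output line skips several fields compared with the others, because of the quotes. What do I need to change in the regular expression line? Thanks in advance for any help.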

Here is part of the code:

import sys
import re
import time

# get the date
date = time.strftime("%x")


# function for reading in each line of file
# returns array of each line
def readIn(file):
    array = []
    for line in file:
        array.append(line)
    return array


def main():
    data = open(sys.argv[1], "r")
    template = open(sys.argv[2], "r")
    output = open(sys.argv[3], "w")

    finalL = []

    dataL = []
    dataL = readIn(data)

    templateL = []
    templateL = readIn(template)

    costY = 0
    dateStr = ""

    # split each line in the data by the comma unless there are quotes
    for i in range(0, len(dataL)):
        if '"' in dataL[i]:
            Pattern = re.compile(r'''((?:[^,"']|"[^"]*"|'[^']*')+)''')
            dataL[i] = Pattern.split(dataL[i])[1::2]
            for j in range(0, len(dataL[i])):
                dataL[i][j] = dataL[i][j].strip()
        else:       
            temp = dataL[i].strip().split(",")
            dataL[i] = temp

Sample data:

OrgLevel3: ATHLET ,,,,,,,,
,,,,,,,,
Name,,,Calls,,Duration,Cost ($),,
,,,,,,,,
ATHLET Direct,,,"1,312 ",,62:58:18,130.62 ,,
,,,,,,,,
Grand Total for ATHLET:,,,"1,312 ",,62:58:18,130.62 ,,
,,,,,,,,
OrgLevel3: BOOK ,,,,,,,,
,,,,,,,,
Name,,,Calls,,Duration,Cost ($),,
,,,,,,,,
BOOK Direct,,,434 ,,14:59:18,28.09 ,,
,,,,,,,,
Grand Total for BOOK:,,,434 ,,14:59:18,28.09 ,,
,,,,,,,,
OrgLevel3: CARD ,,,,,,,,
,,,,,,,,
Name,,,Calls,,Duration,Cost ($),,
,,,,,,,,
CARD Direct,,,253 ,,09:02:54,14.30 ,,
,,,,,,,,
Grand Total for CARD:,,,253 ,,09:02:54,14.30 ,,

Sample output:

['Grand Total for ATHLET:', '"1,312 "', '62:58:18', '130.62', '']
['Grand Total for BOOK:', '', '', '434 ', '', '14:59:18', '28.09 ', '', '']
['Grand Total for CARD:', '', '', '253 ', '', '09:02:54', '14.30 ', '', '']

1 Answer:

Answer 0 (score: 0)

If you are just trying to load the CSV into a list, then the complete code to do that is:

import csv
import sys

# csv.reader understands quoted fields, so "1,312 " stays a single field
with open(sys.argv[1]) as data:
    dataL = list(csv.reader(data))
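To see why this removes the need for the regular expression, here is a minimal sketch that runs one quoted line from the sample data through csv.reader. The io.StringIO object only stands in for the open file and is an assumption made for illustration.

```python
import csv
import io

# One line copied from the sample data; io.StringIO stands in for the real file.
sample = 'Grand Total for ATHLET:,,,"1,312 ",,62:58:18,130.62 ,,\n'

rows = list(csv.reader(io.StringIO(sample)))
row = rows[0]

# The quoted comma is kept inside the field and the quotes are removed,
# so this row has the same number of fields as the unquoted rows.
print(len(row))  # 9
print(repr(row[3]))  # '1,312 '
```

Note that the trailing space inside the quotes is preserved, so the fields may still need a strip() afterwards.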

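If your sample data is your actual input data, then it needs some additional work first, such as keeping only the rows you want, for example: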

# "row and" skips the empty lists that csv.reader yields for blank lines
dataL = [row for row in csv.reader(data) if row and row[0].startswith('Grand Total for')]
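Putting the filter together with some cleanup, a rough sketch might look like the following. The helper name grand_totals and the inline sample string are hypothetical and used only for illustration; the real script would pass the opened file instead. It also assumes you want whitespace stripped and the call counts converted to integers.

```python
import csv
import io

# A shortened copy of the sample data, plus one blank line to exercise the guard.
sample = """\
OrgLevel3: ATHLET ,,,,,,,,
,,,,,,,,
Name,,,Calls,,Duration,Cost ($),,
ATHLET Direct,,,"1,312 ",,62:58:18,130.62 ,,
Grand Total for ATHLET:,,,"1,312 ",,62:58:18,130.62 ,,

OrgLevel3: BOOK ,,,,,,,,
BOOK Direct,,,434 ,,14:59:18,28.09 ,,
Grand Total for BOOK:,,,434 ,,14:59:18,28.09 ,,
"""


def grand_totals(lines):
    """Return the 'Grand Total for' rows with whitespace stripped from each field."""
    totals = []
    for row in csv.reader(lines):
        # csv.reader yields [] for a blank line, so check the row is non-empty first.
        if row and row[0].startswith('Grand Total for'):
            totals.append([field.strip() for field in row])
    return totals


totals = grand_totals(io.StringIO(sample))

# Drop the thousands separator before converting the Calls column to int.
calls = [int(row[3].replace(',', '')) for row in totals]

for row in totals:
    print(row)
print(calls)
```

Every row comes back with the same nine fields, so column positions such as row[3] for Calls line up across quoted and unquoted lines.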