Question

My CSV comes from Bloomberg and has the following format:

Time Interval,Close,Net Chg,Open,High,Low,Tick Count,Volume
05SEP2012,,,,,,,
09:15 - 09:30,97.722,0,98.34,98.34,97.722,2,37155
09:30 - 09:45,97.899,0.177,98.164,98.164,97.281,102,101725
09:45 - 10:00,97.722,-0.177,97.899,97.899,97.193,32,39874
06SEP2012,,,,,,,
09:15 - 09:30,98.076,0.883,98.076,98.076,98.076,1,22429
09:30 - 09:45,97.193,-0.883,97.634,97.987,97.104,72,67741
09:45 - 10:00,96.928,-0.265,97.193,97.193,96.751,80,148963...

I would like to unify the format so that [date XX/XX/201X + time XX:XX-XX:XX] becomes the matching key. The result might look like this:

Date,Time Interval,Close,Net Chg,Open,High,Low,Tick Count,Volume
05SEP2012,,,,,,,,
05SEP2012,09:15 - 09:30,97.722,0,98.34,98.34,97.722,2,37155
05SEP2012,09:30 - 09:45,97.899,0.177,98.164,98.164,97.281,102,101725
05SEP2012,09:45 - 10:00,97.722,-0.177,97.899,97.899,97.193,32,39874
06SEP2012,,,,,,,,
06SEP2012,09:15 - 09:30,98.076,0.883,98.076,98.076,98.076,1,22429
06SEP2012,09:30 - 09:45,97.193,-0.883,97.634,97.987,97.104,72,67741
06SEP2012,09:45 - 10:00,96.928,-0.265,97.193,97.193,96.751,80,148963...

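Could anyone tell me what code I should write? I am very new to programming and am trying to write a Python program for pairs trading as a school project. This article is my main reference, but its data-loading step cannot read the CSV data we collected.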

Answer 1

For Python 3:

import csv

with open('data.csv', 'r', newline='') as f, open('data_out.csv', 'w', newline='') as f_out:
    reader = csv.reader(f, quotechar='"')
    # read headers
    headers = next(reader)
    # insert new column name
    headers.insert(0, "Date")

    w = csv.writer(f_out, delimiter=',')
    # write headers
    w.writerow(headers)

    for line in f:
        if ',,,' in line:
            # date-only row: remember the date and copy the row through unchanged
            newcolumn = line.strip().replace(',', '')
            f_out.write(line)
        else:
            # data row: prepend the most recent date
            line = newcolumn + ',' + line.strip()
            line = line.split(',')
            w.writerow(line)

For Python 2.7:

import csv
with open('data.csv', 'rb') as f,  open('data_out.csv', 'wb') as f_out:
    reader = csv.reader(f,quotechar='"')
    # read headers
    headers = next(reader)
    # insert new column name
    headers.insert(0,"Date")

    w = csv.writer(f_out, delimiter=',' )
    # write headers
    w.writerow(headers)

    for line in f:
        if ',,,' in line:
            # date-only row: remember the date
            newcolumn = line.strip().replace(',', '')
            f_out.write(line)
        else:
            line = newcolumn + ',' + line.strip()
            line = line.split(',')
            w.writerow(line)

Output:

    Date,Time Interval,Close,Net Chg,Open,High,Low,Tick Count,Volume
    05SEP2012,,,,,,,
    05SEP2012,09:15 - 09:30,97.722,0,98.34,98.34,97.722,2,37155
    05SEP2012,09:30 - 09:45,97.899,0.177,98.164,98.164,97.281,102,101725
    05SEP2012,09:45 - 10:00,97.722,-0.177,97.899,97.899,97.193,32,39874
    06SEP2012,,,,,,,
    06SEP2012,09:15 - 09:30,98.076,0.883,98.076,98.076,98.076,1,22429
    06SEP2012,09:30 - 09:45,97.193,-0.883,97.634,97.987,97.104,72,67741
    06SEP2012,09:45 - 10:00,96.928,-0.265,97.193,97.193,96.751,80,148963
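
If you want the date rows to end up with the same number of columns as the data rows (as in the target layout in the question), one option is to let the csv module parse every row instead of splitting the text by hand. The following is only a sketch under some assumptions: the function name add_date_column is made up for illustration, a date row is taken to be any row whose first field is filled and whose other fields are all empty, and the sample is a shortened copy of the data in the question.

```python
import csv
import io


def add_date_column(src, dst):
    """Copy CSV rows from src to dst, prefixing every row with its date."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")

    header = next(reader)
    writer.writerow(["Date"] + header)

    current_date = ""
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Assumption: a date row has a value in the first field only.
        if row[0] and not any(row[1:]):
            current_date = row[0]
            # Pad with empty fields so the column count matches the header.
            writer.writerow([current_date] + [""] * len(row))
        else:
            writer.writerow([current_date] + row)


# Shortened sample taken from the question.
sample = """Time Interval,Close,Net Chg,Open,High,Low,Tick Count,Volume
05SEP2012,,,,,,,
09:15 - 09:30,97.722,0,98.34,98.34,97.722,2,37155
06SEP2012,,,,,,,
09:15 - 09:30,98.076,0.883,98.076,98.076,98.076,1,22429
"""

out = io.StringIO()
add_date_column(io.StringIO(sample), out)
print(out.getvalue())
```

To run it on real files, pass file objects opened with newline='' in place of the io.StringIO objects.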

Answer 2

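# Placeholder: replace with the path to your CSV
path_to_file = 'data.csv'

# Initialize list to hold the rows
rows = []

# Open the file; the with block closes it again when done
with open(path_to_file, 'r') as csv_file:
    # For each line, strip the trailing newline, split the values
    # into a list and add that list to rows
    for line in csv_file:
        rows.append(line.strip().split(','))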

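Now each element of rows is a list with the same structure. You can compare corresponding "cells", for example the first column of two consecutive rows:

rows[1][0] vs rows[2][0], keeping in mind that list indices are zero-based.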
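
As a rough illustration of that comparison, here is a small sketch. It assumes the rows are already in the unified layout from the question (date in the first column, time interval in the second), and the sample lines are cut down to three columns to keep it short. The names same_date and keys are made up for this example.

```python
# Sample lines in the unified layout, shortened to three columns.
lines = [
    "Date,Time Interval,Close\n",
    "05SEP2012,09:15 - 09:30,97.722\n",
    "05SEP2012,09:30 - 09:45,97.899\n",
    "06SEP2012,09:15 - 09:30,98.076\n",
]

# Strip the newline and split each line into a list of values.
rows = [line.strip().split(",") for line in lines]

# Compare the first column of the rows at index 1 and 2.
same_date = rows[1][0] == rows[2][0]

# Combine date and time interval into one matching key per data row.
keys = [(row[0], row[1]) for row in rows[1:]]

print(same_date)
print(keys[0])
```

A key built this way could then be used to look up the matching row in a second file.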

Hope this gets you on your way,

Cheers

How to unify the data format of a CSV file on the same column
