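Convert the date and time columns in a CSV to relative time

Time: 2015-11-17 19:03:33

Tags: python csv datetime python-3.x data-analysis

I have a CSV file with two columns that hold a date and a time. The whole period spans 24 hours. I want to take these two columns and convert them into a single column of relative time running from 00:00:00 to 23:59:59.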

Layer      date       time        Ht    Stat      Vg     Temp
57986   8/01/2015   13:53:05    0.00m    87      15.4    None
20729   8/01/2015   11:23:21    45.06m   82      11.6    None
20729   8/01/2015   11:44:36    45.06m   81      11.6    None
20729   8/01/2015   12:17:11    46.08m   79      11.6    None

The sample CSV data is shown above.

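import csv

date = []
time = []
# Python 3: open in text mode with newline='' as the csv module expects
with open('output_file.csv', newline='') as inf:
    incsv = csv.reader(inf)
    next(incsv)  # skip the header row
    for row in incsv:
        date.append(row[1])
        time.append(row[2])
print('min date {} max date {} min time {} max time {}'.format(min(date), max(date), min(time), max(time)))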

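I have the minimum and maximum values of the date and time columns. I want to convert the two columns into a single relative-time column whose values start at 00:00:00 and run up to xx:xx:xx.

How can I do this?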

1 answer:

Answer 0: (score: 2)

For a CSV input file named output_file.csv:

Layer,      date,       time,        Ht,    Stat,      Vg,     Temp
57986,   8/01/2015,   13:53:05,    0.00m,    87,      15.4,    None
20729,   8/01/2015,   11:23:21,    45.06m,   82,      11.6,    None
20729,   8/01/2015,   11:44:36,    45.06m,   81,      11.6,    None
20729,   8/01/2015,   12:17:11,    46.08m,   79,      11.6,    None

this program:

import csv
import datetime

min_date = None

row_list = []
date_list = []

# Python 3: open in text mode with newline='' as the csv module expects
with open('output_file.csv', newline='') as inf:
    incsv = csv.reader(inf)
    next(incsv)  # Skip the header row
    for row in incsv:
        # Build a single timestamp string from the date and time columns
        time_str = row[1].strip() + " " + row[2].strip()
        # Convert the timestamp string to a datetime
        cur_date = datetime.datetime.strptime(time_str, "%m/%d/%Y %H:%M:%S")
        # Track the earliest datetime seen so far
        if min_date is None:
            min_date = cur_date
        else:
            min_date = min(min_date, cur_date)
        # Append the datetime to the list
        date_list.append(cur_date)
        # Get a copy of the row with whitespace removed
        new_row = [col.strip() for col in row]
        # Replace the date and time columns with a single placeholder
        new_row = [new_row[0], ""] + new_row[3:]
        # Append the row to the list
        row_list.append(new_row)

# Fill each placeholder with the offset from the earliest datetime
for new_row, cur_date in zip(row_list, date_list):
    new_row[1] = str(cur_date - min_date)

for row in row_list:
    print(row)

produces this output:

['57986', '2:29:44', '0.00m', '87', '15.4', 'None']
['20729', '0:00:00', '45.06m', '82', '11.6', 'None']
['20729', '0:21:15', '45.06m', '81', '11.6', 'None']
['20729', '0:53:50', '46.08m', '79', '11.6', 'None']
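
The offsets above come from str() on a timedelta, so the hour is not zero-padded (2:29:44 rather than 02:29:44). If the fixed-width 00:00:00 to 23:59:59 form from the question is wanted, a small helper along these lines might do it. This is only a sketch: format_hms is a hypothetical name, and it assumes the offsets are non-negative; offsets of 24 hours or more would simply show an hour value above 23.

```python
import datetime


def format_hms(delta):
    """Format a non-negative timedelta as zero-padded HH:MM:SS (hypothetical helper)."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, seconds)


# Two timestamps taken from the sample data (read as month/day/year)
start = datetime.datetime(2015, 8, 1, 11, 23, 21)
later = datetime.datetime(2015, 8, 1, 13, 53, 5)
print(format_hms(later - start))  # 02:29:44
```

In the program above, format_hms(cur_date - min_date) could be stored in the placeholder in place of the plain str() value.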

You may need to modify it to produce exactly what you want.
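
One possible modification, if the result should go back into a CSV file instead of being printed, is sketched below. It is not part of the original program: add_relative_time and the rel_time column name are made up for illustration, and io.StringIO stands in for real files so the snippet runs on its own. It assumes at least one data row and no blank lines. The fmt parameter is exposed because a date such as 8/01/2015 is ambiguous; pass "%d/%m/%Y %H:%M:%S" if the file is day-first.

```python
import csv
import datetime
import io

# Sample input in the same layout as output_file.csv (header plus two rows)
sample = io.StringIO(
    "Layer, date, time, Ht, Stat, Vg, Temp\n"
    "57986, 8/01/2015, 13:53:05, 0.00m, 87, 15.4, None\n"
    "20729, 8/01/2015, 11:23:21, 45.06m, 82, 11.6, None\n"
)


def add_relative_time(infile, outfile, fmt="%m/%d/%Y %H:%M:%S"):
    """Copy rows to outfile, merging the date and time columns into one offset column."""
    reader = csv.reader(infile, skipinitialspace=True)
    header = [col.strip() for col in next(reader)]
    rows = [[col.strip() for col in row] for row in reader]
    # Parse every date/time pair, then measure each one from the earliest
    stamps = [datetime.datetime.strptime(r[1] + " " + r[2], fmt) for r in rows]
    start = min(stamps)
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow([header[0], "rel_time"] + header[3:])
    for row, stamp in zip(rows, stamps):
        writer.writerow([row[0], str(stamp - start)] + row[3:])


result = io.StringIO()
add_relative_time(sample, result)
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by open('output_file.csv', newline='') and an output file opened with mode 'w' and newline=''.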