I have a series of data in a CSV file, as shown below.
Hour L Dr Tag L 0 1 2 3 4 5 6 7 8 9 10
0 L5 XI PS 4R 6 3 6 6 5 6 1 9 11 2
0 L5 XI PS 4R 5 8 10 7 7 8 3 9 5 8
1 L0 St v2T 4R 1 0 0 0 0 0 0 0 0 6
1 L2 TI sst 4R 8 8 8 8 8 8 8 8 8 8
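The first row holds the column headers. The columns to the right of the column headed L are numbered sequentially from 0 to 59; only the values up to 9 are shown here. As you can see, the data is sorted by the Hour column, i.e. hour 0 followed by hour 1.
I would like to change this so that the rows for hours "1 and above" are appended as columns at the end of the hour 0 rows. It should look up the Tag field among the hour 0 rows and add the 60 values as new columns at the end. The column headers should be updated to indicate hour.minute (e.g. 0.0, 0.1, ..., 1.0, 1.1, ...).
If a new tag is encountered that does not exist for hour 0, the tag should be added and only the 60 values for that hour should be filled in. All other values should be set to 0.
I am trying to do this in Python. As a first step, I am trying to detect whether the hour has changed; once that works, I plan to write code that reads all records for hour n and merges the minute values to the right based on Tag. Is my approach correct? Or can someone suggest a better one?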
import csv
import os
import sys
from glob import glob

hour = 0
p_hour = -1
c_hour = -1
rownum = 0
row_header = []
file_list = []

def format_minute_field(row_header,hour):
    hdr_len = len(row_header)
    for i in range((hdr_len-60),hdr_len):
        row_header[i] = '{}:{}'.format(hour,row_header[i])
    return row_header

if __name__ == '__main__':
    fd = open('test.csv','rt')
    rownum = 0
    reader = csv.reader(fd)
    for row in reader:
        if rownum == 0:
            row_header = row
            row_header = format_minute_field(row_header,0)
            print('row_header {}'.format(row_header))
            rownum +=1
        else:
            if rownum == 1:
                previous_row = row
            if (row[0] != previous_row[0]) and (rownum > 1):
                print('hour changed from {} to {}'.format(previous_row[0],row[0]))
            previous_row = row
            rownum +=1
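One possible way to detect the hour boundaries without tracking the previous row by hand is `itertools.groupby`, which groups consecutive items that share a key. The following is only a sketch under the assumption that the rows are already sorted by hour, as described above; the shortened sample (three minute columns) and the helper name `rows_by_hour` are made up for illustration.

```python
import csv
import io
from itertools import groupby

# Hypothetical, shortened sample (only three minute columns) for illustration.
SAMPLE = """Hour,L,Dr,Tag,0,1,2
0,L5,XI,PS,6,3,6
1,L0,St,v2T,1,0,0
1,L2,TI,sst,8,8,8
"""

def rows_by_hour(lines):
    """Yield (hour, rows) pairs; assumes the rows are already sorted by hour."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    # groupby starts a new group each time the value in column 0 changes
    for hour, group in groupby(reader, key=lambda row: row[0]):
        yield int(hour), list(group)

grouped = list(rows_by_hour(io.StringIO(SAMPLE)))
for hour, rows in grouped:
    print('hour {}: {} row(s)'.format(hour, len(rows)))
```

Note that `groupby` only groups adjacent rows, so this relies on the input staying sorted by hour.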
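The expected output is as follows. If a particular tag is present in both the hour 0 list and the hour 1 list, the values should be recorded as shown below. If a new tag is encountered while processing the hour 1 records, it should be appended to the list.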
Hour L Dr Tag L 0 0:1 0:2 0:3 0:4 0:5 0:6 0:7 0:8 0:9 0:10 ..........0:59 1:0 1:1 1:2 1:3
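Answer 0 (score: 1)
Just because I found it interesting, I wrote some code. However, your test data does not exactly match what you describe; in particular, the first two data rows have the same tag, and it is not clear what should be done with them. Still, here is code that should do what you need. Hope it helps.
The idea is to use DictReader and DictWriter to manage cells by column name, without caring about the order in which they are read or written until it is really necessary. Then I use the data dictionary to merge rows together based on an arbitrary key, defined by the values of some specific cells, relying on the fact that tuples can be used as keys in dicts.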
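To show the tuple-as-key idea in isolation before the full script, here is a minimal sketch. The records are hypothetical and shortened to a single minute column, and the names `records` and `merged` are made up for the example.

```python
# Hypothetical, shortened records: (hour, L, Dr, Tag, value at minute 1)
records = [
    (0, 'L5', 'XI', 'PS', '6'),
    (1, 'L0', 'St', 'v2T', '1'),
    (1, 'L5', 'XI', 'PS', '8'),
]

merged = {}  # maps (L, Dr, Tag) -> one output row, itself a dict
for hour, l_val, dr, tag, value in records:
    key = (l_val, dr, tag)  # tuples are hashable, so they work as dict keys
    # Create the output row the first time this key is seen
    row = merged.setdefault(key, {'L': l_val, 'Dr': dr, 'Tag': tag})
    # Store the value under a column name of the form 'hour:minute'
    row['{}:1'.format(hour)] = value

for key, row in merged.items():
    print(key, row)
```

The two PS records end up in the same output row, while v2T gets a row with only an hour 1 column; the missing hour 0 cells are left to be filled in when writing.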
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import csv
import sys

# This will store the lines by tag, in order to join them
data = {}
# This will tell us how to order the columns, will be extended later
columns = ['L', 'Dr', 'Tag']
# Controls extension of the columns
maxhour = 0
# Controls the order the keys are found in the original CSV. Maybe not necessary
keys = []

with open("in.csv", 'r', newline='') as fin:
    reader = csv.DictReader(fin)
    for row in reader:
        hour = int(row['Hour'])
        # Form a unique key to match lines. Adjust to your needs
        key = (row['L'], row['Dr'], row['Tag'])
        if key not in data:
            # This is a future row, a dict with column as key, cell as value
            data[key] = {'L': row['L'], 'Dr': row['Dr'], 'Tag': row['Tag']}
            # Remember the order we've seen the keys
            keys.append(key)
        # Now, add data to the row for each minutes
        # 1 to 59
        for minute in range(1,60):
            # Copy data from column 'minute' to column 'Hour:minute'
            src_colname = str(minute)
            dest_colname = row['Hour'] + ':' + src_colname
            data[key][dest_colname] = row[src_colname]
        # There seems to be a special treatment for minute 0, at column "L 0"
        if hour == 0:
            data[key]['L 0'] = row['L 0']
        else:
            data[key][row['Hour'] + ':0'] = row['L 0']
        # Plan to generate enough columns when writing resulting file
        maxhour = max(maxhour, hour)

with open("out.csv", 'w', newline='') as fout:
    # Okay, now everything was merged into data
    # We need to tell DictWriter how to order columns
    # Treat special first column
    columns.append('L 0')
    # Then add the rest
    for hour in range(0, maxhour+1):
        # Do not include "0:0"
        for minute in range(0 if hour > 0 else 1,60):
            columns.append('{:d}:{:d}'.format(hour, minute))
    # Let's write that
    writer = csv.DictWriter(fout, columns, restval = "0")
    writer.writeheader()
    for key in keys: # or for key in data.keys(): if you don't mind the order
        writer.writerow(data[key])
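The zero filling in the script above comes from the `restval` argument of `csv.DictWriter`: any column listed in the field names but missing from a row dict is written with that value. A small in-memory sketch, with made-up column names and rows, shows the effect:

```python
import csv
import io

# Hypothetical, shortened column list and rows for illustration.
columns = ['Tag', '0:1', '1:1']
rows = [
    {'Tag': 'PS', '0:1': '6', '1:1': '8'},
    {'Tag': 'v2T', '1:1': '1'},  # no hour-0 data for this tag
]

buffer = io.StringIO()
# restval is written for every column that a row dict does not contain
writer = csv.DictWriter(buffer, columns, restval='0', lineterminator='\n')
writer.writeheader()
writer.writerows(rows)

output = buffer.getvalue()
print(output)
```

The second row has no `0:1` key, so that cell is written as `0`.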
Test data in.csv:
Hour,L,Dr,Tag,L 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59
0,L5,XI,PS,4R,6,3,6,6,5,6,1,9,11,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,L5,XI,PS,4R,5,8,10,7,7,8,3,9,5,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,L0,St,v2T,4R,1,0,0,0,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,L2,TI,sst,4R,8,8,8,8,8,8,8,8,8,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,L5,XI,PS,4R,8,8,8,8,8,8,8,8,8,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Output:
L,Dr,Tag,L 0,0:1,0:2,0:3,0:4,0:5,0:6,0:7,0:8,0:9,0:10,0:11,0:12,0:13,0:14,0:15,0:16,0:17,0:18,0:19,0:20,0:21,0:22,0:23,0:24,0:25,0:26,0:27,0:28,0:29,0:30,0:31,0:32,0:33,0:34,0:35,0:36,0:37,0:38,0:39,0:40,0:41,0:42,0:43,0:44,0:45,0:46,0:47,0:48,0:49,0:50,0:51,0:52,0:53,0:54,0:55,0:56,0:57,0:58,0:59,1:0,1:1,1:2,1:3,1:4,1:5,1:6,1:7,1:8,1:9,1:10,1:11,1:12,1:13,1:14,1:15,1:16,1:17,1:18,1:19,1:20,1:21,1:22,1:23,1:24,1:25,1:26,1:27,1:28,1:29,1:30,1:31,1:32,1:33,1:34,1:35,1:36,1:37,1:38,1:39,1:40,1:41,1:42,1:43,1:44,1:45,1:46,1:47,1:48,1:49,1:50,1:51,1:52,1:53,1:54,1:55,1:56,1:57,1:58,1:59
L5,XI,PS,4R,5,8,10,7,7,8,3,9,5,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4R,8,8,8,8,8,8,8,8,8,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
L0,St,v2T,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4R,1,0,0,0,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
L2,TI,sst,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4R,8,8,8,8,8,8,8,8,8,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
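In the output above, the L5,XI,PS row carries the hour 0 values of the second input line (5,8,10,...), because a later line with the same key and hour overwrites the earlier one. If that is not the intended behaviour, a check along the following lines could flag such collisions before merging. This is only a sketch: the shortened sample and the name `duplicates` are made up, and what to do with the colliding rows (keep the first, sum them, reject the file) is left open, since the question does not say.

```python
import csv
import io
from collections import Counter

# Hypothetical, shortened sample with a single minute column.
SAMPLE = """Hour,L,Dr,Tag,L 0,1
0,L5,XI,PS,4R,6
0,L5,XI,PS,4R,5
1,L5,XI,PS,4R,8
"""

# Count how often each (Hour, L, Dr, Tag) combination occurs
counts = Counter(
    (row['Hour'], row['L'], row['Dr'], row['Tag'])
    for row in csv.DictReader(io.StringIO(SAMPLE))
)
# Keep only the combinations that appear more than once
duplicates = {key: n for key, n in counts.items() if n > 1}
print(duplicates)
```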