Python 3 - Reading from and writing to a .csv

Time: 2016-06-22 11:43:24

Tags: python csv

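I am trying to read data from a CSV file (A), extract values, and write them to a different CSV file (B). The new file B should have two rows: the first row should contain all the predefined variables, and the second row should hold the value belonging to each variable in row 1.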

(A) Python code

import csv
from collections import defaultdict

data = defaultdict(str)

#Make a list with the predefined variables
definition = ["record_id", "abbreviation", "patient_id", "study_id",
"step_count", "distance", "ambulation_time", "velocity", "cadence",
"normalized_velocity", "step_time_differential", "step_length_differential",
"cycle_time_differential", "step_time", "step_length", "step_extremity",
"cycle_time", "stride_length", "hh_base_support", "swing_time",
"stance_time", "single_supp_time", "double_supp_time", "toe_in_out"]

#Read the GaitRite .csv
with open('C:/Users/Kay_v/Documents/School/Exports/Export 3.csv', 'r')  as f, open('C:/Users/Kay_v/Documents/School/Exports/result.csv', 'w') as outfile: 
    reader = csv.reader(f, delimiter=';')
    next(reader, None)  # skip the headers
    writer = csv.DictWriter(outfile, fieldnames=definition, lineterminator='\n')
    writer.writeheader()

#Read the .csv row by row
    for row in reader:
        #print(row)
        for item in definition:
            h = item.replace('_', '')
            r0 = row[0].lower().replace(' ', '')
            if h in r0:
                try:
                    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
                except ValueError:
                    avg = 0  # for cases with empty strings or lone commas
                #print(avg)
                print(h, r0, row[1], row[2])
                data[item] = row[1]

    data['record_id'] = 1

# Write the clean result.csv
    writer.writerow(data)

(B) Expected output

record_id  abbreviation  study_id  step_count  distance 
1                                  3           292,34

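With the help of "Moses Koledoye" I managed to read most of the variables and write them to a clean .csv file. The problems I am running into are the following: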

Problem (1)

In result.csv (the clean .csv output file), I am missing the values for the following variables:

step_time
step_length
cycle_time  
stride_length   
hh_base_support 
swing_time  
stance_time 
single_supp_time    
double_supp_time    
toe_in_out

I used this part of the code to check the result:

print(h, r0, row[1], row[2])

which gave me the following output:

stepcount stepcount 3  
distance distance 292,34  
ambulationtime ambulationtime 1,67  
velocity velocity 175,1  
cadence cadence 107,8  
velocity normalizedvelocity ,  
normalizedvelocity normalizedvelocity ,  
steptimedifferential steptimedifferential 0,004  
steptime steptimedifferential 0,004  
steplengthdifferential steplengthdifferential 1,051  
steplength steplengthdifferential 1,051  
cycletimedifferential cycletimedifferential 0,008  
cycletime cycletimedifferential 0,008  
steptime steptime(sec) 0,558 0,554
steplength steplength(cm) 96,746 97,797
stepextremity stepextremity(ratio) , ,
cycletime cycletime(sec) 1,116 1,108
stridelength stridelength(cm) 192,159 197,122
hhbasesupport hhbasesupport(cm) 2,988 6,32
swingtime swingtime(sec) 0,466 0,466
stancetime stancetime(sec) 0,65 0,642
velocity stridevelocity 172,185 177,908
steptime steptimestddev , 0,006
stridelength stridelengthstddev , ,
swingtime swingtimestddev , ,
stancetime stancetimestddev , ,
velocity stridevelocitystddev , ,
singlesupptime singlesupptimestddev , ,
doublesupptime doublesupptimestddev , ,

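From the output above you can see two problems: some names match multiple strings (such as velocity), and some do not match at all (such as toe_in_out).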
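
The multiple matches come from the substring test `if h in r0`: "steptime" is contained in "steptimedifferential", "steptime(sec)" and "steptimestddev", so a later row can overwrite an earlier value. One possible way around this, as an untested sketch, is to normalize both names and compare them for equality instead. The helpers `normalize` and `match_field` are hypothetical names, and the shortened `definition` list is only for illustration. Whether toe_in_out would match this way depends on how that label is spelled in the export, which the output above does not show; the sketch assumes it differs only by spaces or punctuation.

```python
import re

# Shortened field list, for illustration only
definition = ["step_count", "velocity", "normalized_velocity", "step_time",
              "step_time_differential", "toe_in_out"]

def normalize(label):
    """Lower-case a label, drop any '(unit)' part and all non-alphanumerics."""
    label = re.sub(r'\(.*?\)', '', label.lower())
    return re.sub(r'[^a-z0-9]', '', label)

# Map normalized name -> original field name, so lookups are exact matches
lookup = {normalize(name): name for name in definition}

def match_field(raw_label):
    """Return the field whose normalized name equals the label, else None."""
    return lookup.get(normalize(raw_label))
```

In the loop, `item = match_field(row[0])` would then replace the inner `for item in definition` scan, and rows such as "Stride Velocity" or "Step Time Std Dev" would be skipped because they equal none of the predefined names.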

Problem (2)

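The other problem is getting the average into result.csv. Whenever a variable has two values, I use this part of the code to calculate the average. But I would like this average to be included in the output result.csv as well.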

try:
    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
except ValueError:
    avg = 0  # for cases with entry strings or commas
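
Note that the code in (A) stores `row[1]` in `data[item]`, so `avg` is computed but never written. A minimal sketch of one way to change that is shown here; `to_float` and `row_value` are hypothetical helper names. Unlike the snippet above, it assumes that a row with only one numeric entry should keep that entry rather than become 0, and that a row with no numeric entries should stay empty.

```python
def to_float(text):
    """Parse a decimal-comma string such as '0,558'; return None if not numeric."""
    try:
        return float(text.replace(',', '.'))
    except ValueError:
        return None

def row_value(row):
    """Average the numeric entries in row[1:3], rounded to 2 decimals.

    Returns '' when neither column holds a number (assumed behaviour).
    """
    numbers = [v for v in (to_float(cell) for cell in row[1:3]) if v is not None]
    if not numbers:
        return ''
    return round(sum(numbers) / len(numbers), 2)
```

With these helpers, `data[item] = row_value(row)` could take the place of `data[item] = row[1]` in the loop.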

I hope someone can help me with these problems; it would be much appreciated!

If you like, you can experiment with the export file, which you can download here: CSV export file

0 answers:

No answers yet