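I am trying to read data from a CSV file (A), extract values, and write them to a different CSV file (B). The new file B should have two rows: the first row should contain all predefined variables, and the second row should hold the value belonging to each variable in row 1.
(A) Python code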
import csv
from collections import defaultdict
data = defaultdict(str)
#Make a list with the predefined variables
definition = ["record_id", "abbreviation", "patient_id", "study_id",
    "step_count", "distance", "ambulation_time", "velocity", "cadence",
    "normalized_velocity", "step_time_differential", "step_length_differential",
    "cycle_time_differential", "step_time", "step_length", "step_extremity",
    "cycle_time", "stride_length", "hh_base_support", "swing_time",
    "stance_time", "single_supp_time", "double_supp_time", "toe_in_out"]
#Read the GaitRite .csv
with open('C:/Users/Kay_v/Documents/School/Exports/Export 3.csv', 'r') as f, open('C:/Users/Kay_v/Documents/School/Exports/result.csv', 'w') as outfile:
    reader = csv.reader(f, delimiter=';')
    next(reader, None) # skip the headers
    writer = csv.DictWriter(outfile, fieldnames=definition, lineterminator='\n')
    writer.writeheader()
    #Read the .csv row by row
    for row in reader:
        #print(row)
        for item in definition:
            h = item.replace('_', '')
            r0 = row[0].lower().replace(' ', '')
            if h in r0:
                try:
                    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
                except ValueError:
                    avg = 0 # for cases with entry strings or commas
                #print(avg)
                print(h, r0, row[1], row[2])
                data[item] = row[1]
                data['record_id'] = 1
    # Write the clean result.csv
    writer.writerow(data)
(B) Expected output
record_id  abbreviation  study_id  step_count  distance
1                                  3           292,34
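As a small sketch of the two-row layout, the snippet below writes a header row and one data row with csv.DictWriter into an in-memory buffer. The semicolon delimiter is an assumption (the code above uses the default comma); it is chosen here so that decimal-comma values such as 292,34 are written without quotes. Columns that have no value are left empty through restval.

```python
import csv
import io

# Shortened column list, taken from the expected output above.
fieldnames = ["record_id", "abbreviation", "study_id", "step_count", "distance"]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, delimiter=';',
                        restval='', lineterminator='\n')
writer.writeheader()  # row 1: the variable names
# Row 2: one value per variable; keys that are absent fall back to restval.
writer.writerow({"record_id": 1, "step_count": "3", "distance": "292,34"})

result = buffer.getvalue()
print(result)
```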
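With the help of "Moses Koledoye" I managed to read most of the variables and write them to the clean .csv file. The problems I am running into are as follows:
Problem (1)
In result.csv (the clean .csv output file), the values for the following variables are missing: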
step_time
step_length
cycle_time
stride_length
hh_base_support
swing_time
stance_time
single_supp_time
double_supp_time
toe_in_out
I used this part of the code to check the results:
print(h, r0, row[1], row[2])
which gave me the following output:
stepcount stepcount 3
distance distance 292,34
ambulationtime ambulationtime 1,67
velocity velocity 175,1
cadence cadence 107,8
velocity normalizedvelocity ,
normalizedvelocity normalizedvelocity ,
steptimedifferential steptimedifferential 0,004
steptime steptimedifferential 0,004
steplengthdifferential steplengthdifferential 1,051
steplength steplengthdifferential 1,051
cycletimedifferential cycletimedifferential 0,008
cycletime cycletimedifferential 0,008
steptime steptime(sec) 0,558 0,554
steplength steplength(cm) 96,746 97,797
stepextremity stepextremity(ratio) , ,
cycletime cycletime(sec) 1,116 1,108
stridelength stridelength(cm) 192,159 197,122
hhbasesupport hhbasesupport(cm) 2,988 6,32
swingtime swingtime(sec) 0,466 0,466
stancetime stancetime(sec) 0,65 0,642
velocity stridevelocity 172,185 177,908
steptime steptimestddev , 0,006
stridelength stridelengthstddev , ,
swingtime swingtimestddev , ,
stancetime stancetimestddev , ,
velocity stridevelocitystddev , ,
singlesupptime singlesupptimestddev , ,
doublesupptime doublesupptimestddev , ,
From the output above you can see that some names match more than one string (such as velocity), while others do not match at all (such as toe_in_out).
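To illustrate the over-matching, here is a minimal sketch that replaces the substring test (h in r0) with an exact comparison after normalizing the label. It is one possible approach, not a confirmed fix: the capitalized label spellings are assumed from the printed r0 values, and "Toe In / Out(deg)" is a purely hypothetical label, since the real one does not appear in the output above. The helper names normalize and match_column are made up for this example.

```python
import re

definition = ["velocity", "normalized_velocity", "step_time",
              "step_time_differential", "toe_in_out"]

# Map the underscore-free form of each variable to its output column name.
wanted = {name.replace('_', ''): name for name in definition}

def normalize(label):
    """Lowercase, drop a parenthesised unit such as '(sec)', keep only letters and digits."""
    label = re.sub(r'\([^)]*\)', '', label.lower())
    return re.sub(r'[^a-z0-9]', '', label)

def match_column(label):
    """Return the column whose name equals the normalized label, or None."""
    return wanted.get(normalize(label))

# Label spellings are assumptions; the last one is hypothetical.
labels = ["Velocity", "Normalized Velocity", "Step Time Differential",
          "Step Time(sec)", "Stride Velocity", "Step Time Std Dev",
          "Toe In / Out(deg)"]
for label in labels:
    print(label, '->', match_column(label))
```

With an exact comparison, "Stride Velocity" and "Step Time Std Dev" no longer overwrite velocity and step_time, because neither normalizes to exactly one of the wanted names.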
Problem (2)
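Another problem is getting the average into result.csv. Whenever a variable has two values, I use this part of the code to calculate the average, but I would like this average to be included in the output result.csv as well.
try:
    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
except ValueError:
    avg = 0 # for cases with entry strings or commas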
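One way the average could reach the output is to compute it from whichever cells are numeric and store that result in data[item] instead of row[1]. The sketch below assumes this is the desired behavior. Unlike the try/except above, it averages only the cells that parse, so a single value such as 3 is kept rather than turned into 0. The helper names parse_decimal and average_of are hypothetical.

```python
def parse_decimal(text):
    """Convert a decimal-comma string such as '0,558' to float; None if not numeric."""
    try:
        return float(text.strip().replace(',', '.'))
    except ValueError:
        return None

def average_of(cells, ndigits=2):
    """Average the numeric cells; return '' when none of them is numeric."""
    values = [v for v in map(parse_decimal, cells) if v is not None]
    if not values:
        return ''
    return round(sum(values) / len(values), ndigits)

# Sample cells taken from the printed output above.
print(average_of(['0,558', '0,554']))  # two values
print(average_of(['292,34', '']))      # one value
print(average_of([',', ',']))          # no usable value
```

Inside the loop this would be used as data[item] = average_of(row[1:3]), assuming every row has at least three columns.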
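I hope someone can help me with these problems; it would be greatly appreciated!
If you like, you can play around with the export file, which you can download here: CSV export file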