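I am trying to extract a table from an .mdb file, filter that table, and write the result to a short .csv file. So far I can extract the table I need and save its contents to a .csv. What I don't know is how to sort the data and pull out only the rows I need. I suppose I could save the whole .csv and then reopen it, but I have about 2000 mdb files to process, so that would take a lot of disk space. I only want to extract certain rows.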
Cycle Test_Time Current Voltage
1 7.80E-002 0.00E+000 1.21E-001
1 3.01E+001 0.00E+000 1.19E-001
1 6.02E+001 0.00E+000 1.17E-001
2 9.02E+001 0.00E+000 1.14E-001
2 1.20E+002 0.00E+000 1.11E-001
2 1.50E+002 0.00E+000 1.08E-001
2 1.80E+002 0.00E+000 1.05E-001
2 2.10E+002 0.00E+000 1.02E-001
3 2.40E+002 0.00E+000 9.93E-002
3 2.70E+002 0.00E+000 9.66E-002
3 3.00E+002 0.00E+000 9.38E-002
3 3.10E+002 4.00E-001 1.26E+000
For example, from the table above I would like to do the following:
Here is my code:
import sys, subprocess, glob
mdbfiles = glob.glob('*.res')
for DATABASE in mdbfiles:
    subprocess.call(["mdb-schema", DATABASE, "mysql"])
    table_names = subprocess.Popen(["mdb-tables", "-1", DATABASE],
                                   stdout=subprocess.PIPE).communicate()[0]
    tables = table_names.splitlines()
    sys.stdout.flush()
    a = str('Channel_Normal_Table')
    for table in tables:
        if table != '' and table == a:
            filename = DATABASE.replace(".res", "") + ".csv"
            file = open(filename, 'w')
            print("Dumping " + table)
            contents = subprocess.Popen(["mdb-export", DATABASE, table],
                                        stdout=subprocess.PIPE).communicate()[0]
            # I NEED TO PUT SOMETHING HERE TO SORT AND EXTRACT THE DATA I NEED
            file.write(contents)
            file.close()
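Answer 0 (score: 1)
It is probably easier not to work with a flat list of rows, but to convert it into a structure that makes the data easier to "query". Something like a dict of dicts, where each inner dict represents one cycle and holds that cycle's rows: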
cycles = {}
rows = contents.splitlines()  # split the `contents` text blob into individual lines
for row in rows[1:]:  # the first line in your question is a header - [1:] skips it
    row = row.split()  # split each line by whitespace
    # fetch the dict for this cycle, creating it on first sight
    cycle = cycles.setdefault(row[0], {'id': row[0], 'rows': []})
    cycle['rows'].append({'cycle': row[0], 'test_time': row[1], 'current': row[2]})  # ... and so on for the other columns
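A minimal, self-contained sketch of the same grouping idea follows. It assumes that `mdb-export` produces comma-separated text with a header line (so the `csv` module is used instead of a whitespace split) and that the columns are named as in the question's table. The `SAMPLE` string and the `group_by_cycle` helper are hypothetical names used only for illustration, and the values are converted to numbers up front so that later sorting and range checks behave numerically.

```python
import csv
import io

# Hypothetical sample standing in for the text returned by mdb-export
# (assumption: comma-separated, with a header line).
SAMPLE = """Cycle,Test_Time,Current,Voltage
1,7.80E-002,0.00E+000,1.21E-001
1,3.01E+001,0.00E+000,1.19E-001
2,9.02E+001,0.00E+000,1.14E-001
2,1.20E+002,0.00E+000,1.11E-001
3,3.10E+002,4.00E-001,1.26E+000
"""


def group_by_cycle(text):
    """Return {cycle_number: [row_dict, ...]} with numeric values."""
    cycles = {}
    for record in csv.DictReader(io.StringIO(text)):
        row = {
            'cycle': int(record['Cycle']),
            'test_time': float(record['Test_Time']),
            'current': float(record['Current']),
            'voltage': float(record['Voltage']),
        }
        # Create the list for this cycle on first sight, then append.
        cycles.setdefault(row['cycle'], []).append(row)
    return cycles


cycles = group_by_cycle(SAMPLE)
print(sorted(cycles))   # cycle numbers that were found
print(len(cycles[2]))   # number of rows in cycle 2
```

Under Python 3 the output of `communicate()` is `bytes`, so it would need a `.decode()` before being passed to a helper like this one.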
Then you can sort the rows of each cycle by test_time:
for key, cycle in cycles.items():
    # the values are strings such as '3.01E+001', so compare them as numbers
    cycle['rows'].sort(key=lambda row: float(row['test_time']))
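One detail worth checking when sorting by test_time: the values in the table are text in scientific notation, and text sorts character by character, not by magnitude. The short sketch below, using three values taken from the question's table, shows the difference between sorting them as strings and sorting them with `key=float`.

```python
# Three Test_Time values from the question's table, still as text.
times = ['9.02E+001', '1.20E+002', '7.80E-002']

# Plain sorting compares character by character, so '1...' comes first.
as_text = sorted(times)

# Converting each value with float() for the comparison gives numeric order.
as_numbers = sorted(times, key=float)

print(as_text)
print(as_numbers)
```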
Then you can process your data. The last row of each cycle:
for key, cycle in cycles.items():
    output_row(cycle['rows'][-1])  # rows are sorted, so [-1] is the latest one
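The answer leaves `output_row` and `output_rows` undefined. One possible shape for them, as a sketch only, is a thin wrapper around `csv.DictWriter`. The `FIELDS` list and the extra `out` parameter (an open text file) are assumptions added here for illustration; they are not part of the original answer.

```python
import csv
import io

# Assumed column names, matching the keys used in the row dicts.
FIELDS = ['cycle', 'test_time', 'current', 'voltage']


def output_rows(rows, out):
    """Write a list of row dicts to the open text file `out` as CSV lines."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator='\n')
    writer.writerows(rows)


def output_row(row, out):
    """Write a single row dict."""
    output_rows([row], out)


# Demonstration with an in-memory buffer instead of a real file.
buffer = io.StringIO()
buffer.write(','.join(FIELDS) + '\n')  # header line
output_row({'cycle': '1', 'test_time': '6.02E+001',
            'current': '0.00E+000', 'voltage': '1.17E-001'}, buffer)
result = buffer.getvalue()
print(result)
```

In the question's loop, `out` could be the file opened for the short .csv, so only the selected rows ever reach the disk.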
The rows of the last five cycles:
# sort by numeric cycle id, then take the last five with [-5:]
for key, cycle in sorted(cycles.items(), key=lambda item: int(item[0]))[-5:]:
    output_rows(cycle['rows'])
The rows of cycles 4 to 30:
for idx in range(4, 31):
    cycle = cycles[str(idx)]  # keys are strings, as read from the file
    output_rows(cycle['rows'])
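Indexing `cycles[str(idx)]` raises `KeyError` if a file has no data for one of the cycles in the range, which seems plausible across some 2000 files. A hedged variant that skips missing cycles is sketched below. The helper name `rows_for_cycle_range` and the small `cycles` dict are hypothetical; the dict only mimics the structure built earlier in the answer.

```python
def rows_for_cycle_range(cycles, first, last):
    """Collect the rows of cycles first..last (inclusive), skipping gaps."""
    selected = []
    for idx in range(first, last + 1):
        cycle = cycles.get(str(idx))  # None when this cycle is absent
        if cycle is not None:
            selected.extend(cycle['rows'])
    return selected


# Hypothetical data in the same shape as the answer's `cycles` dict.
cycles = {
    '4': {'id': '4', 'rows': [{'cycle': '4', 'test_time': '1.00E+000'}]},
    '6': {'id': '6', 'rows': [{'cycle': '6', 'test_time': '2.00E+000'},
                              {'cycle': '6', 'test_time': '3.00E+000'}]},
    '31': {'id': '31', 'rows': [{'cycle': '31', 'test_time': '4.00E+000'}]},
}

picked = rows_for_cycle_range(cycles, 4, 30)
print([row['cycle'] for row in picked])
```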