I have several ASCII files that look like this:
001, 12:04, ...., ...., ...., ....
001, 12:05, ...., ...., ...., ....
001, 12:06, ...., ...., ...., ....
002, 12:07, ...., ...., ...., ....
002, 12:08, ...., ...., ...., ....
002, 12:09, ...., ...., ...., ....
002, 12:10, ...., ...., ...., ....
002, 12:11, ...., ...., ...., ....
003, 12:12, ...., ...., ...., ....
003, 12:13, ...., ...., ...., ....
003, 12:14, ...., ...., ...., ....
003, 12:15, ...., ...., ...., ....
003, 12:16, ...., ...., ...., ....
003, 12:17, ...., ...., ...., ....
003, 12:18, ...., ...., ...., ....
003, 12:19, ...., ...., ...., ....
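and so on. What I want is the fifth value of the last row before the first column changes to the next ID. For example, the fifth value of these rows, and so on: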
001, 12:06, ...., ...., ...., ....
002, 12:11, ...., ...., ...., ....
003, 12:19, ...., ...., ...., ....
Any help doing this with Python or NumPy, please?
Answer 0 (score: 3)
lines_as_string = """001, 12:04, ...., ...., ...., 1...
001, 12:05, ...., ...., ...., 2...
001, 12:06, ...., ...., 2..., 3...
002, 12:07, ...., ...., 1..., 1...
002, 12:08, ...., ...., 2..., 2...
002, 12:09, ...., ...., 3..., 3...
002, 12:10, ...., ...., 4..., 4...
002, 12:11, ...., ...., 5..., 5...
003, 12:12, ...., ...., 5..., 1...
003, 12:13, ...., ...., 1..., 2...
003, 12:14, ...., ...., 2..., 3...
003, 12:15, ...., ...., 3..., 4...
003, 12:16, ...., ...., 4..., 5...
003, 12:17, ...., ...., 5..., 6...
003, 12:18, ...., ...., 6..., 7...
003, 12:19, ...., ...., 7..., 8..."""
last_fives = [v for k,v in sorted({l[0]:l[5] for l in [a.split(', ') for a in lines_as_string.split('\n')]}.items())]
Here is the same thing broken down into more readable steps.
lines = []
# split into rows
for line in lines_as_string.split('\n'):
    # split into columns
    columns = line.split(', ')
    lines.append(columns)
last_lines = {}
# make a dict where row[0] -> row[5]
# (index 5 is the sixth, last field; use row[4] for the fifth)
for row in lines:
    # since the rows are already ordered,
    # earlier rows with the same key get overwritten
    last_lines[row[0]] = row[5]
# create a list sorted by the original keys
last_fives = []
for k, v in sorted(last_lines.items()):
    last_fives.append(v)
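As an alternative to the dictionary approach above, here is a minimal sketch using `itertools.groupby` from the standard library. It keeps the whole last row of each run of identical IDs, so any column can be picked afterwards. The `sample` data and the helper name `last_row_per_group` are made up for illustration, and the sketch assumes the fields are comma-separated and that rows with the same ID are contiguous, as in the question.

```python
from itertools import groupby

# Hypothetical sample in the same shape as the data in the question.
sample = """001, 12:04, a, b, c, d
001, 12:06, a, b, e, f
002, 12:07, a, b, g, h
002, 12:11, a, b, i, j
003, 12:19, a, b, k, l"""


def last_row_per_group(lines, key_index=0):
    """Return the last row of each run of consecutive rows sharing a key."""
    # Split each non-empty line into stripped fields.
    rows = [[field.strip() for field in line.split(',')]
            for line in lines if line.strip()]
    # groupby yields one group per run of equal keys, in file order.
    return [list(group)[-1]
            for _, group in groupby(rows, key=lambda row: row[key_index])]


last_rows = last_row_per_group(sample.splitlines())
# Index 4 is the fifth field (indices start at 0).
fifth_values = [row[4] for row in last_rows]
print(fifth_values)
```

Since the helper only iterates over lines, an open file object can be passed in place of `sample.splitlines()`. Unlike the dictionary version, the result follows the order of the file rather than the sorted order of the IDs.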