I have a text file containing CPU statistics (from sar / sysstat) that looks like this:
17:30:38 CPU %user %nice %system %iowait %steal %idle
17:32:49 all 14.56 2.71 3.79 0.00 0.00 78.94
17:42:49 all 12.68 2.69 3.44 0.00 0.00 81.19
17:52:49 all 12.14 2.67 3.22 0.01 0.00 81.96
18:02:49 all 12.28 2.67 3.20 0.03 0.00 81.82
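My goal is to build a list for each column (except CPU, %nice and %steal) so that I can plot them with bokeh. I tried splitting each line into a list, but I don't know how to ignore certain values, i.e.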
#!/usr/bin/python
cpu_time = []
cpu_user = []
cpu_system = []
cpu_iowait = []
cpu_idle = []
with open('stats.txt') as F:
    for line in F:
        time, ignore, user, ignore, system, iowait, ignore, idle = line.split()
        cpu_time.append(time)
        cpu_user.append(user)
        cpu_system.append(system)
        cpu_iowait.append(iowait)
        cpu_idle.append(idle)
Is there a better/shorter way to do this? More specifically, the way I ignore certain items doesn't look good to me.
Answer 0 (score: 1)
This is a more dynamic version that scales to more columns. There is nothing really wrong with your implementation, though.
# build a dict of column name -> list of column values
stats = {}
with open('stats.txt') as F:
    header = None
    for idx, line in enumerate(F):
        # This is the header
        if idx == 0:
            # save the header for later use
            # (the timestamp column is keyed by the header's first field, e.g. '17:30:38')
            header = line.split()
            for word in header:
                stats[word] = []
        else:
            # combine the header with the line to get a dict
            line_dict = dict(zip(header, line.split()))
            # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
            for key, val in line_dict.items():
                stats[key].append(val)
# remove keys we don't want
stats.pop('%nice')
stats.pop('%steal')
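As a variation on the idea above, the wanted columns can be whitelisted by header name instead of popping the unwanted ones afterwards. This is only a minimal sketch: `read_columns`, `WANTED` and the inline `SAMPLE` text are names made up for illustration (the sample rows are copied from the question), and it assumes every non-timestamp column should be converted to float.

```python
import io

# Sample rows copied from the question; a real script would open 'stats.txt' instead.
SAMPLE = """\
17:30:38 CPU %user %nice %system %iowait %steal %idle
17:32:49 all 14.56 2.71 3.79 0.00 0.00 78.94
17:42:49 all 12.68 2.69 3.44 0.00 0.00 81.19
"""

# Hypothetical whitelist: header names of the columns to keep.
WANTED = ['%user', '%system', '%iowait', '%idle']


def read_columns(fobj, wanted):
    """Return a dict mapping 'time' and each wanted header name to a list of values."""
    header = next(fobj).split()
    # Look up the position of each wanted column in the header row.
    positions = {name: header.index(name) for name in wanted}
    columns = {name: [] for name in wanted}
    columns['time'] = []
    for line in fobj:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        columns['time'].append(fields[0])  # first field is the timestamp
        for name, pos in positions.items():
            columns[name].append(float(fields[pos]))
    return columns


stats = read_columns(io.StringIO(SAMPLE), WANTED)
```

Because the columns are located by name, the code keeps working if sar is run with options that add or reorder columns, as long as the wanted names are still present in the header.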
Answer 1 (score: 1)
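This is a bit more generic. You can define a list of the column names you want. It uses csv.DictReader to read the file. The names are given without the leading %. In addition, it converts the time into a datetime.time object from the datetime module and all other columns into floats. You can use the converters dictionary to specify your own data conversion function for any column.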
import csv
import datetime

def make_col_keys(fobj, col_names):
    time_key = fobj.readline().split()[0]
    cols = {'time': time_key}
    cols.update({key: '%' + key for key in col_names})
    fobj.seek(0)
    return cols

def convert_time(time_string):
    return datetime.datetime.strptime(time_string, '%H:%M:%S').time()

converters = {'time': convert_time}

def read_stats(file_name, col_names, converters=converters):
    with open(file_name) as fobj:
        cols = make_col_keys(fobj, col_names)
        reader = csv.DictReader(fobj, delimiter=' ', skipinitialspace=True)
        data = {}
        for line in reader:
            for new_key, old_key in cols.items():
                value = converters.get(new_key, float)(line[old_key])
                data.setdefault(new_key, []).append(value)
    return data

def main(file_name, col_names=None):
    if col_names is None:
        col_names = ['user', 'system', 'iowait', 'idle']
    return read_stats(file_name, col_names)

main('stats.txt')
Result:
{'idle': [78.94, 81.19, 81.96, 81.82],
 'iowait': [0.0, 0.0, 0.01, 0.03],
 'system': [3.79, 3.44, 3.22, 3.2],
 'time': [datetime.time(17, 32, 49),
          datetime.time(17, 42, 49),
          datetime.time(17, 52, 49),
          datetime.time(18, 2, 49)],
 'user': [14.56, 12.68, 12.14, 12.28]}
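To illustrate how a custom entry in the converters dictionary would take effect, here is a small standalone sketch of the same lookup pattern, `converters.get(key, float)(value)`, applied to a single row. The `to_fraction` converter and the hand-written `row` dict are assumptions made up for this example; they are not part of the answer's code.

```python
import datetime


def convert_time(time_string):
    # Same conversion as in the answer: 'HH:MM:SS' -> datetime.time
    return datetime.datetime.strptime(time_string, '%H:%M:%S').time()


def to_fraction(text):
    # Hypothetical custom converter: percentage string -> fraction between 0 and 1
    return float(text) / 100.0


# 'idle' gets a custom converter; any key not listed falls back to float.
converters = {'time': convert_time, 'idle': to_fraction}

# One row keyed by the shortened column names (values taken from the question's data).
row = {'time': '17:32:49', 'user': '14.56', 'idle': '78.94'}

converted = {key: converters.get(key, float)(value) for key, value in row.items()}
```

Passing a dictionary like this as the `converters` argument of `read_stats` should apply the same per-column dispatch to every row of the file.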
Answer 2 (score: 0)
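First, you can use _ or __ to denote ignored values (this is a common convention).
Next, you can store all the values in a single list and then use zip to unpack that list into multiple lists.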
cpu_stats = []
with open('stats.txt') as stats_file:
    for line in stats_file:
        time, _, user, _, system, iowait, _, idle = line.split()
        cpu_stats.append([time, user, system, iowait, idle])
cpu_time, cpu_user, cpu_system, cpu_iowait, cpu_idle = zip(*cpu_stats)
You could write this with a list comprehension, but I don't think it is any more readable:
with open('stats.txt') as stats_file:
    lines = (line.split() for line in stats_file)
    cpu_stats = [
        (time, user, system, iowait, idle)
        for time, _, user, _, system, iowait, _, idle
        in lines
    ]
cpu_time, cpu_user, cpu_system, cpu_iowait, cpu_idle = zip(*cpu_stats)
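Another way to avoid the placeholder names altogether is to pick fields by position with operator.itemgetter. This is a different technique from the unpacking shown above, given only as a rough sketch: the index tuple assumes the exact column order from the question, and the inline `SAMPLE` text stands in for 'stats.txt'. It also skips the header row, which the snippets above would otherwise store as the first element of each list.

```python
import io
from operator import itemgetter

# Sample rows copied from the question; a real script would open 'stats.txt' instead.
SAMPLE = """\
17:30:38 CPU %user %nice %system %iowait %steal %idle
17:32:49 all 14.56 2.71 3.79 0.00 0.00 78.94
17:42:49 all 12.68 2.69 3.44 0.00 0.00 81.19
"""

# Positions of time, %user, %system, %iowait and %idle in each split row
# (assumes the column order shown in the question).
pick = itemgetter(0, 2, 4, 5, 7)

with io.StringIO(SAMPLE) as stats_file:
    next(stats_file)  # skip the header row
    rows = [pick(line.split()) for line in stats_file if line.strip()]

# Transpose the list of rows into one tuple per column.
cpu_time, cpu_user, cpu_system, cpu_iowait, cpu_idle = zip(*rows)

# The values are still strings at this point; convert a column before plotting it.
cpu_user = [float(value) for value in cpu_user]
```

The trade-off is that the indexes are less self-describing than names, so the comment next to `pick` carries the meaning that the `_` placeholders carried in the unpacking version.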