I am building a script to pull data from server logs. The data comes in the format shown below: a timestamp followed by a frequency count.
From the lines below, I am trying to build a list that contains only the number after each dash:
20:52:37 - 3
20:52:38 - 8
20:52:39 - 28
20:52:40 - 58
20:52:41 - 59
20:52:42 - 51
20:52:43 - 37
20:52:44 - 22
20:52:45 - 4
20:52:47 - 14
20:52:48 - 15
20:52:49 - 12
20:52:50 - 4
20:52:51 - 5
20:52:52 - 12
20:52:53 - 5
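I tried splitting the output and appending only the elements I need, but I keep getting errors. My approach was to split on the dash and the newline character and then pick the right position for each number, aiming for a list like: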
[3,8,28,etc.,etc.]
Answer 0 (score: 1)
You can use re.findall:
import re
s = """
20:52:37 - 3
20:52:38 - 8
20:52:39 - 28
20:52:40 - 58
20:52:41 - 59
....
"""
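# Raw string so \s and \d reach the regex engine unchanged; list() because map() is lazy in Python 3
data = list(map(int, re.findall(r'(?<=\s-\s)\d+', s)))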
Output:
[3, 8, 28, 58, 59...]
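As a variation on the lookbehind above, here is a minimal sketch that anchors the pattern to whole lines and uses a capture group instead. The names `log_text` and `LINE_PATTERN` are made up for this illustration, and the pattern assumes every line is exactly an `HH:MM:SS` timestamp, then ` - `, then a count.

```python
import re

# Sample input copied from the first lines of the question.
log_text = """\
20:52:37 - 3
20:52:38 - 8
20:52:39 - 28
"""

# Hypothetical stricter pattern: timestamp, " - ", then the count.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
LINE_PATTERN = re.compile(r"^\d{2}:\d{2}:\d{2} - (\d+)$", re.MULTILINE)

# With a single capture group, findall returns only the captured text.
counts = [int(n) for n in LINE_PATTERN.findall(log_text)]
print(counts)  # [3, 8, 28]
```

Lines that do not fit the pattern are silently skipped, which may or may not be what you want for a real log file.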
Answer 1 (score: 0)
To remove the trailing newline, you can use rstrip():
with open('server.log') as f:
    lines = (line.rstrip() for line in f)  # remove trailing newlines
    lines = (line for line in lines if line)  # drop blank lines
    # The generators read from f lazily, so build the list while the file is still open
    res = [int(line.split(' - ')[-1]) for line in lines]
Output:
>>> res
[3, 8, 28, 58, 59, 51, 37, 22, 4, 14, 15, 12, 4, 5, 12, 5]
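If the log might contain lines that do not follow the `time - count` shape, the split approach could be wrapped in a small helper that skips them instead of raising. This is only a sketch: `extract_counts` and `sample` are hypothetical names, and it assumes the count is always the text after the last ` - ` on the line.

```python
def extract_counts(lines):
    """Return the integer after the last ' - ' on each well-formed line."""
    counts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # rpartition splits on the last ' - '; sep is '' when it is absent.
        _, sep, value = line.rpartition(" - ")
        if sep and value.isdecimal():
            counts.append(int(value))
    return counts


# Hypothetical input mixing valid, blank, and malformed lines.
sample = ["20:52:37 - 3\n", "\n", "20:52:38 - 8\n", "not a log line\n"]
print(extract_counts(sample))  # [3, 8]
```

Because the helper accepts any iterable of strings, an open file object can be passed to it directly.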