Median, mode, and mean in Python

Time: 2017-04-17 09:44:38

Tags: python regex statistics

I am trying to load regex matches into a list and then compute the median, mode, and mean.

Data file (pc1.txt):

2017-04-16 13:32:59 
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.05614841124945
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.50960924380334

2017-04-16 13:33:05
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.08875159384721
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.5102938969471

2017-04-16 13:33:10
\\desktop-XXXXXXX\processor(_total)\% processor time : 0
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.46869437193207

BootTime 200938

 ------------------------------------ 
 ------------------------------------ 

2017-04-16 13:40:11 
\\desktop-XXXXXXX\processor(_total)\% processor time : 4.37510327488846
\\desktop-XXXXXXX\memory\% committed bytes in use : 4.438387242009

2017-04-16 13:40:17
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.90625777477218
\\desktop-XXXXXXX\memory\% committed bytes in use : 4.44426156598249

2017-04-16 13:40:22
\\desktop-XXXXXXX\processor(_total)\% processor time : 0.078229917076289
\\desktop-XXXXXXX\memory\% committed bytes in use : 4.44589104046464

BootTime 69920

Regular expressions to find the values:

Processor: ^[\\].+processor.+[: ](\d*\.?\d*)
Memory: ^[\\].+memory.+[: ](\d*\.?\d*)
Boottime: ^BootTime.(\d+)

So far I have tried:

with open('pc1.txt') as f:
    for line in f:
        re.findall(processor, f)

However, I am unable to 1) match the values; 2) put them into a list; 3) compute the median, mode, and mean.

I have a basic idea of how to compute the mode:

from statistics import mode
mode([value1, value2])

But I still cannot put all the pieces together. I am also open to using any other programming language that handles statistics in an easy, simple way.

1 answer:

Answer 0 (score: 0):

The matches read from the text file are strings, not floats, so they must be converted before the median and mean can be computed. I used statistics rather than numpy.
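A minimal sketch of this approach, using only the standard library. It assumes the file layout shown in the question; the helper names `extract_values` and `summarize`, the tightened patterns, and the inline `SAMPLE` text are illustrative and not taken from the original answer.

```python
import re
from statistics import StatisticsError, mean, median, mode

# Patterns adapted from the question. re.MULTILINE makes ^ and $ match at
# every line boundary, so the whole file can be searched in one call.
PROCESSOR = re.compile(r"^\\\\.+processor.+:[ \t]*(\d+(?:\.\d+)?)[ \t]*$", re.MULTILINE)
MEMORY = re.compile(r"^\\\\.+memory.+:[ \t]*(\d+(?:\.\d+)?)[ \t]*$", re.MULTILINE)
BOOTTIME = re.compile(r"^BootTime\s+(\d+)", re.MULTILINE)

# Illustrative excerpt of pc1.txt (raw string keeps the backslashes literal).
SAMPLE = r"""
2017-04-16 13:32:59
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.05614841124945
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.50960924380334

2017-04-16 13:33:05
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.08875159384721
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.5102938969471

2017-04-16 13:33:10
\\desktop-XXXXXXX\processor(_total)\% processor time : 0
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.46869437193207

BootTime 200938
"""


def extract_values(pattern, text):
    """Return every number captured by pattern in text, converted to float."""
    return [float(match) for match in pattern.findall(text)]


def summarize(values):
    """Return the mean, median, and mode of a non-empty list of numbers."""
    try:
        most_common = mode(values)
    except StatisticsError:
        # Python < 3.8 raises when there is no single most common value.
        most_common = None
    return {"mean": mean(values), "median": median(values), "mode": most_common}


for label, pattern in (("processor", PROCESSOR), ("memory", MEMORY), ("boottime", BOOTTIME)):
    print(label, summarize(extract_values(pattern, SAMPLE)))
```

To run it against the real file, read the text with `open('pc1.txt').read()` and pass it in place of `SAMPLE`. Because the readings are continuous measurements, the mode is rarely informative: on Python 3.8+ `statistics.mode` returns the first of several equally common values, while earlier versions raise `StatisticsError`.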