Parsing text with bash / Python

Date: 2013-11-17 01:33:40

Tags: python bash parsing

I have the following output from FlexLM (lmstat -a):

Users of server:  (Total of 5 licenses issued;  Total of 4 licenses in use)

"feature1" v9.0, vendor: klocwork
floating license

tyoung host01 /dev/tty (v9.0) (flex1.com/27000 57756), start Mon 3/21 9:06       (linger: 1209600)
jfall host02 /dev/tty (v9.0) (flex1.com/27000 17731), start Fri 3/18 12:54 (linger: 1209600)
jfall host03 /dev/pts/1 (v9.0) (flex1.com/27000 29438), start Thu 3/24 9:33 (linger: 1209600)
jfall host04 /dev/tty (v9.0) (flex1.com/27000 12791), start Thu 3/24 13:39 (linger: 1209600)

Users of client:  (Total of 10 licenses issued;  Total of 5 licenses in use)

"feature2" v9.0, vendor: klocwork
floating license

jfall host04 /dev/tty (v9.0) (flex1.com/27000 127), start Thu 3/24 13:39 (linger: 1209600)

I would like to get output like the following:
jfall feature1 17731
jfall feature1 29438
jfall feature1 12791
jfall feature2 127

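I have a preference for bash and/or Python. Note that this output can change at any time, and any user may have checked out feature1, feature2, or both. That is why I am having a hard time finding a parsing pattern.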

Thanks for any help.

What I have so far:

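# Note: assignments take no leading "$"; USER also shadows the login variable.
USER="jfall"
LOGFILE="mylog.log"
# Field 6 is the handle followed by "),"; tr strips those characters.
HANDLE=(`grep "$USER" "$LOGFILE" | awk '{print $6}' | tr -d '),'`)
HANDLE_LENGTH=${#HANDLE[@]}
for ((i=0; i<HANDLE_LENGTH; i++))
    do
            echo "$USER ${HANDLE[$i]}"
    done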

Output:

jfall 17731
jfall 29438
jfall 12791
jfall 127

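But I don't know how to get the feature that belongs to each line. My first idea was to look upward from each result for the nearest line beginning with a double quote ("), but I am not sure how to implement that.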

1 answer:

Answer 0: (score: 3)

OK, so what you need to do is split the text along the regularities you can see in it.

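I see that each block starts with a line saying Users of..., and that the next non-empty line contains the feature name enclosed in quotes: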

# string = the example output you gave
import re
# Drop the "Users of ..." header lines; skip chunks that are empty or whitespace only
sections = [x.strip() for x in re.split(r'Users of.*', string) if x.strip()]
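
To see what this split produces, here is a minimal sketch on a shortened sample. The `sample` string is an assumption (a trimmed copy of the output in the question), and the chunks are filtered with `x.strip()` so that whitespace-only pieces are skipped as well.

```python
import re

# Trimmed stand-in for the lmstat output (assumption: same layout as in the question).
sample = """Users of server:  (Total of 5 licenses issued;  Total of 4 licenses in use)

"feature1" v9.0, vendor: klocwork
floating license

jfall host02 /dev/tty (v9.0) (flex1.com/27000 17731), start Fri 3/18 12:54 (linger: 1209600)

Users of client:  (Total of 10 licenses issued;  Total of 5 licenses in use)

"feature2" v9.0, vendor: klocwork
floating license

jfall host04 /dev/tty (v9.0) (flex1.com/27000 127), start Thu 3/24 13:39 (linger: 1209600)
"""

# Splitting on the "Users of ..." header lines leaves one chunk per feature.
sections = [x.strip() for x in re.split(r'Users of.*', sample) if x.strip()]

for section in sections:
    # After strip(), the first line of each chunk is the quoted feature line.
    print(section.split('\n')[0])
```

With this sample the loop prints the two quoted feature lines, one per section.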

The rest is done in a loop. Assume everything that follows is inside for section in sections.

Now we need to get the title of each section:

title = section.split('\n')[0].split('"')[1]
#get first line, get everything between first quotation marks
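
To see why index [1] picks out the feature name, here is a small sketch on the first line of one section. The variable names `first_line` and `parts` are only for illustration.

```python
# First line of a section, copied from the question's output.
first_line = '"feature1" v9.0, vendor: klocwork'

# Splitting on the double quote gives three pieces: the text before the
# first quote (empty), the quoted feature name, and the rest of the line.
parts = first_line.split('"')
title = parts[1]

print(parts)
print(title)
```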

Now you need to parse each line to get the name and the number you need:

for line in section.split('\n'):
    # group 1: user name, group 2: the number after "flex1.com/<port> "
    t = re.match(r'(.*?)\s.*?\(flex1\.com/\d+ (\d+)\)', line)
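
As a quick check of what the two capture groups hold, here is a minimal sketch on a single user line from the question. It uses a lightly tightened variant of the pattern (escaped dot, `\S+` for the user name); the names `pattern`, `user`, and `handle` are assumptions for illustration.

```python
import re

# One user line copied from the question's lmstat output.
line = ('jfall host02 /dev/tty (v9.0) (flex1.com/27000 17731), '
        'start Fri 3/18 12:54 (linger: 1209600)')

# Group 1: the user name (everything before the first whitespace).
# Group 2: the handle, i.e. the number after "flex1.com/<port> ".
pattern = re.compile(r'(\S+)\s.*?\(flex1\.com/\d+ (\d+)\)')

m = pattern.match(line)
user, handle = m.groups()
print(user, handle)

# Lines without a "(flex1.com/...)" part, such as this one, do not match.
no_match = pattern.match('floating license')
print(no_match)
```

re.match returns None for non-matching lines, which is why the loop has to test the result before using it.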

The last thing you want is to keep only the lines that start with a specific name:

the_name = "jfall"
final = []
# t.groups() is (name, number), so compare the whole name, not its first character
if t is not None and t.group(1) == the_name:
    final = [t.group(1) + " " + title + " " + t.group(2)]

Collect all the results in a final list named result:

result += final

The complete loop, put together:

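import re
# string = the example output you gave
sections = [x.strip() for x in re.split(r'Users of.*', string) if x.strip()]
result = []
the_name = "jfall"
for section in sections:
    lines = section.split('\n')
    # feature name: text between the first pair of quotes on the first line
    title = lines[0].split('"')[1]
    for line in lines[1:]:
        # group 1: user name, group 2: the number after "flex1.com/<port> "
        t = re.match(r'(.*?)\s.*?\(flex1\.com/\d+ (\d+)\)', line)
        # keep only the lines that belong to the_name
        if t is not None and t.group(1) == the_name:
            result.append(t.group(1) + " " + title + " " + t.group(2))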

I ran this, and here is what it printed:

>>> for i in result:
...     print(i)
... 
jfall feature1 17731
jfall feature1 29438
jfall feature1 12791
jfall feature2 127
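
The same result can also be obtained in a single pass over the lines, which is closer to the idea in the question of remembering the nearest quoted line above each user line. The following is only a sketch under stated assumptions, not the code that was run above: the function name `parse_lmstat` and both patterns are made up for illustration, and the user pattern accepts any `(host/port handle)` group rather than only flex1.com.

```python
import re

# Hypothetical helper, not part of the original answer.
# A feature line starts with the quoted feature name.
FEATURE_RE = re.compile(r'^\s*"([^"]+)"')
# A user line starts with the user name and contains "(<host>/<port> <handle>)".
USER_RE = re.compile(r'^\s*(\S+)\s.*?\(\S+/\d+ (\d+)\)')


def parse_lmstat(text, user):
    """Return 'user feature handle' strings for the given user."""
    results = []
    feature = None  # most recent feature name seen so far
    for line in text.splitlines():
        m = FEATURE_RE.match(line)
        if m:
            feature = m.group(1)  # a new feature block starts here
            continue
        m = USER_RE.match(line)
        if m and feature is not None and m.group(1) == user:
            results.append("%s %s %s" % (user, feature, m.group(2)))
    return results


# The sample output from the question.
sample = """Users of server:  (Total of 5 licenses issued;  Total of 4 licenses in use)

"feature1" v9.0, vendor: klocwork
floating license

tyoung host01 /dev/tty (v9.0) (flex1.com/27000 57756), start Mon 3/21 9:06 (linger: 1209600)
jfall host02 /dev/tty (v9.0) (flex1.com/27000 17731), start Fri 3/18 12:54 (linger: 1209600)
jfall host03 /dev/pts/1 (v9.0) (flex1.com/27000 29438), start Thu 3/24 9:33 (linger: 1209600)
jfall host04 /dev/tty (v9.0) (flex1.com/27000 12791), start Thu 3/24 13:39 (linger: 1209600)

Users of client:  (Total of 10 licenses issued;  Total of 5 licenses in use)

"feature2" v9.0, vendor: klocwork
floating license

jfall host04 /dev/tty (v9.0) (flex1.com/27000 127), start Thu 3/24 13:39 (linger: 1209600)
"""

for row in parse_lmstat(sample, "jfall"):
    print(row)
```

Because the feature name is tracked as state while reading, this variant does not need to split the text into sections first, and it could read from a file or a pipe line by line.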

Let me know if anything is unclear!