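Search for a number in a text file?

Asked: 2017-06-01 13:59:41

Tags: python regex text-processing text-parsing

I'm trying to extract a number from my log file. The number appears after each occurrence of "current store usage is". How can I do this? Can I use the re module?

A line from the log file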

2017-05-30 12:01:03,168 | WARN  | Store limit is 102400 mb (current store usage is 0 mb). The data directory: /opt/apache-activemq-5.12.0/bin/linux-x86-64/../../data only has 6887 mb of usable space - resetting to maximum available disk space: 6887 mb | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain

My code

def log_parser():
    palab2 = "WARN"
    logfile = open("/opt/apache-activemq-5.12.0/data/activemq.log", "r")
    contenlog = logfile.readlines()
    logfile.close()
    for ligne in contenlog:
        if palab2 in ligne:
            print ("Probleme : " + ligne)

3 answers:

Answer 0 (score: 1)

This should work for you:

import re
ligne  = '2017-05-30 12:01:03,168 | WARN | Store limit is 102400 mb (current store usage is 0 mb). The data directory: /opt/apache-activemq-5.12.0/bin/linux-x86-64/../../data only has 6887 mb of usable space - resetting to maximum available disk space: 6887 mb | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain'
print(re.search(r'current store usage is (\d+)', ligne).group(1))
# this returns a 'string', you can convert it to 'int'

Output:

0

Happy coding!!!
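A minimal sketch of the int conversion mentioned in the comment above, with a guard for lines that do not match (calling `.group(1)` on a failed `re.search` raises `AttributeError`, because `re.search` returns `None`). The helper name `store_usage_mb` and the shortened sample line are assumptions made for illustration; only the pattern comes from the answer.

```python
import re

# Same pattern as in the answer; compiled once so it can be reused per line.
USAGE_RE = re.compile(r"current store usage is (\d+)")

def store_usage_mb(line):
    """Return the store usage as an int, or None if the line has no match.

    Hypothetical helper, not part of the original answer.
    """
    match = USAGE_RE.search(line)
    return int(match.group(1)) if match else None

# Shortened version of the log line from the question.
sample = "2017-05-30 12:01:03,168 | WARN  | Store limit is 102400 mb (current store usage is 0 mb)."
print(store_usage_mb(sample))            # matching line
print(store_usage_mb("unrelated line"))  # non-matching line gives None
```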

Answer 1 (score: 0)

Yes, you can use the re module to simplify this considerably. And +1 to @Eric Duminil for suggesting not to read the whole file at once.

import re

def log_parser():
    palab2 = "WARN"
    logfile = "/opt/apache-activemq-5.12.0/data/activemq.log"

    with open(logfile, 'r') as contenlog:
        for ligne in contenlog:
            if re.findall(palab2, ligne):
                print ("Probleme : " + ligne)
                break
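The snippet above prints the first WARN line but does not pull the number out of it. One possible way to combine the two steps is sketched below; the function name `first_warn_usage` and its `level` parameter are hypothetical, and the function takes any iterable of lines so it can be tried without the real log file.

```python
import re

USAGE_RE = re.compile(r"current store usage is (\d+)")

def first_warn_usage(lines, level="WARN"):
    """Return the usage number from the first matching line at `level`.

    Hypothetical helper: `lines` is any iterable of strings, for example
    an open file object, so the file is read one line at a time.
    Returns None when no line matches.
    """
    for line in lines:
        if level in line:  # a plain substring test is enough for a fixed word
            match = USAGE_RE.search(line)
            if match:
                return int(match.group(1))
    return None

# Usage with the path from the question (not executed here):
# with open("/opt/apache-activemq-5.12.0/data/activemq.log", "r") as logfile:
#     print(first_warn_usage(logfile))
```

Passing the open file object keeps the line-by-line reading recommended in this answer.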

Answer 2 (score: 0)

Try this:

import re

def log_parser():
    with open("/opt/apache-activemq-5.12.0/data/activemq.log", "r") as logfile:
        for line in logfile:
            m = re.search(r"current store usage is (\d+)", line)
            if m:
                return m.group(1)

print(log_parser())

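You didn't specify whether you need only the first occurrence (which is what I assumed) or every such line in the file. If it's the latter, just change return to yield and call the function like this: print(list(log_parser()))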
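The return-to-yield change described above might look like the following sketch. The name `iter_store_usage` and the demo lines are assumptions; the function accepts any iterable of lines, so an open file object works as well as a list.

```python
import re

USAGE_RE = re.compile(r"current store usage is (\d+)")

def iter_store_usage(lines):
    """Yield every store usage value found in `lines`, as ints.

    Hypothetical generator variant of log_parser(); `lines` can be an
    open file object or any other iterable of strings.
    """
    for line in lines:
        match = USAGE_RE.search(line)
        if match:
            yield int(match.group(1))

# Made-up demo lines standing in for the log file.
demo = [
    "WARN | Store limit is 102400 mb (current store usage is 0 mb).",
    "INFO | unrelated line",
    "WARN | Store limit is 102400 mb (current store usage is 15 mb).",
]
print(list(iter_store_usage(demo)))
```

Because the generator is lazy, `next(iter_store_usage(lines), None)` would still give just the first value without scanning the rest of the input.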