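I am currently struggling with filtering using regular expressions in Python. I run a command over ssh and capture its stdout. All of that works, but now comes the hard part. The output of the file loaded into stdout looks like this:
Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584
Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584
(this repeats many times). This is the code that applies the regular expression: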
stdin, stdout, stderr = ssh.exec_command('cat ~/Desktop/jvm.log')
result = stdout.readlines()
result = "".join(result)
print(result)
line = re.compile(r'\d+\n')
rline = "".join(line.findall(result))
print(rline)
print(rline) gives:
>> 518979584
>> 518979584
>> 518979584
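(also many times). I only want to print the number once. Printing rline[0] gives me only the first digit of the whole number. I thought about using $, but that did not help. Can anyone help?
Answer 0 (score: 2)
This should give you what you want: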
(\d+)\D*$
Just run a search with it, and it will give you the last number that occurs in the string.
>>> regex = re.compile(r"(\d+)\D*$")
>>> string = "100 20gdg0 3gdfgd00gfgd 400"
>>> r = regex.search(string)
# List the groups found
>>> r.groups()
(u'400',)
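As a rough sketch, the same pattern can be applied to text shaped like the log in the question. The `log_text` sample below is a stand-in made up from the two quoted lines, not the real ssh output, and the `None` fallback is an assumption about how a missing number should be handled.

```python
import re

# Sample text modelled on the log lines quoted in the question (assumed, not real output).
log_text = (
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
)

# (\d+) captures a run of digits; \D*$ allows only non-digits between that run
# and the end of the string, so the captured run is the last number in the text.
last_number = re.compile(r"(\d+)\D*$")

match = last_number.search(log_text)
value = match.group(1) if match else None  # None when the text holds no digits
print(value)
```

Because the trailing newline is matched by `\D*` rather than captured, no `strip()` call is needed on the result.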
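Answer 1 (score: 1)
Your line:
rline = "".join(line.findall(result))
converts the list returned by findall into a single string, which is why rline[0] returns only the first character of that string.
Skip the join and take the first element of the list instead, line.findall(result)[0], as in the example below: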
>>> d = '''
...
... Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584
...
... Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584
... '''
>>> d
'\n\n Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n\n Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n'
>>> import re
>>> line = re.compile(r'\d+\n')
>>> rline = "".join(line.findall(d))
>>> rline
'518979584\n518979584\n'
>>> line.findall(d)
['518979584\n', '518979584\n']
>>> line.findall(d)[0].strip() # strip() used to remove newline character - may not be needed
'518979584'
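A possible alternative to indexing into the findall list, sketched here under assumptions: re.search stops at the first match, and combining `$` with the re.MULTILINE flag makes `$` match at the end of every line rather than only at the end of the whole string, which may be why `$` alone did not help in the question. The helper name `first_heap_size` and the sample text are hypothetical.

```python
import re

def first_heap_size(text):
    """Return the first run of digits that ends a line, or None if there is none.

    Hypothetical helper for illustration; not part of the original code.
    """
    # With re.MULTILINE, $ matches just before each newline, so \d+$ finds
    # digits at the end of any line. The newline itself is not captured.
    match = re.search(r"\d+$", text, re.MULTILINE)
    return match.group(0) if match else None

# Sample text modelled on the log lines quoted in the question.
sample = (
    "\nCommand get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
    "\nCommand get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
)

print(first_heap_size(sample))
```

Returning None instead of raising avoids the IndexError that `findall(...)[0]` would produce on output that contains no number.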
Answer 2 (score: 0)
set() provides uniqueness:
with open(<your file name>) as in_file:
    counts = set(line.rpartition(' ')[2] for line in in_file)
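One caveat with the set approach: each line read from a file keeps its trailing newline, and blank lines contribute entries of their own, so the set can end up with more than one member. A minimal sketch that strips whitespace and skips blank lines follows. It uses io.StringIO as a stand-in for the real file so that it runs on its own, and the sample contents are assumed from the question.

```python
import io

# Stand-in for the real log file (contents assumed from the question).
in_file = io.StringIO(
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
    "\n"
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
)

# Take the text after the last space on each non-blank line and strip the
# trailing newline, so identical values collapse into a single set entry.
counts = {
    line.rpartition(" ")[2].strip()
    for line in in_file
    if line.strip()
}
print(counts)
```

If exactly one value is expected, `len(counts) == 1` is a cheap sanity check before using it.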