Question

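I'm currently struggling with filtering using regular expressions in Python. I'm executing a command over ssh and capturing its stdout. That part works, but then comes the hard part. The output of the file loaded into stdout looks like this:

Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584

Command get executed successfully.   server.jvm.memory.maxheapsize-count-count = 518979584

(this repeats many times). The regular expression I run is: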

stdin, stdout, stderr = ssh.exec_command('cat ~/Desktop/jvm.log')
result = stdout.readlines()
result = "".join(result)
print(result)
line = re.compile(r'\d+\n')
rline = "".join(line.findall(result))
print(rline)

print(rline) outputs:

>> 518979584 

>> 518979584

>> 518979584

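(also many times). I only want to print it once. Printing rline[0] gives me only the first digit of the whole number. I thought about using $, but that doesn't help. Anyone?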

Answer 1

This should give you what you want.

(\d+)\D*$

Just do a search with it, and it will give you the last number that occurs in the string.

>>> import re
>>> regex = re.compile(r"(\d+)\D*$")
>>> string = "100 20gdg0 3gdfgd00gfgd 400"
>>> r = regex.search(string)
>>> # List the groups found
>>> r.groups()
(u'400',)
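
Applied to text shaped like the output in the question, the same pattern can be sketched as follows. The `sample` string and the names `last_number` and `value` are made up for illustration. The trailing newline is absorbed by `\D*`, so this sketch assumes no extra flags are needed.

```python
import re

# Hypothetical sample mimicking the log output from the question.
sample = (
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
    "\n"
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
)

# (\d+) captures a run of digits; \D*$ requires that only non-digits follow
# up to the end of the string, so the captured run is the last number.
last_number = re.compile(r"(\d+)\D*$")

match = last_number.search(sample)
value = match.group(1) if match else None  # None if the text has no digits
print(value)
```

Checking `match` before calling `group(1)` avoids an AttributeError when the captured output contains no number at all.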

Answer 2

Your line:

rline = "".join(line.findall(result))

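joins the list returned by findall into a single string, which is why rline[0] returns the first character of that string.

Simply take the element from the list instead, with line.findall(result)[0], as in the example below: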

>>> d = '''
... 
...     Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584
... 
...     Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584
... '''
>>> d
'\n\n    Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n\n    Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n'
>>> import re
>>> line = re.compile(r'\d+\n')
>>> rline = "".join(line.findall(d))
>>> rline
'518979584\n518979584\n'
>>> line.findall(d)
['518979584\n', '518979584\n']
>>> line.findall(d)[0].strip() # strip() used to remove newline character - may not be needed
'518979584'
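
As a script rather than a REPL session, the same idea might look like the sketch below. The `result` string is a hypothetical stand-in for the joined stdout text, and the names `matches`, `first` and `last` are illustrative. Since the question title asks for the last occurrence, the sketch also shows `[-1]`.

```python
import re

# Hypothetical stand-in for the joined stdout text from the question.
result = (
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
    "\n"
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
)

line = re.compile(r"\d+\n")
matches = line.findall(result)  # a list of strings, one per match

# Index the list, not a joined string: [0] is the first match, [-1] the last.
first = matches[0].strip() if matches else None
last = matches[-1].strip() if matches else None
print(first, last)
```

The `if matches` guard is there because indexing an empty list raises IndexError when nothing matched.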

Answer 3

Mixing shell and Python is never a good idea when you can do everything in Python (as in your case).
No regular expression is needed.

set() provides uniqueness:

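with open(<your file name>) as in_file:
    # strip the trailing newline and skip blank lines before taking the last field
    counts = set(line.strip().rpartition(' ')[2] for line in in_file if line.strip())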
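
A runnable variant of the set() idea is sketched below. It assumes the value is always the last space-separated field on a line. The `io.StringIO` object is a hypothetical stand-in for the log file, so the snippet runs without a file on disk.

```python
import io

# Hypothetical in-memory stand-in for the log file; a real file object from
# open() or the list from stdout.readlines() could be iterated the same way.
in_file = io.StringIO(
    "Command get executed successfully. server.jvm.memory.maxheapsize-count-count = 518979584\n"
    "\n"
    "Command get executed successfully.   server.jvm.memory.maxheapsize-count-count = 518979584\n"
)

# Take the text after the last space on each non-blank line; the set
# collapses repeated values into a single entry.
counts = {line.strip().rpartition(" ")[2] for line in in_file if line.strip()}
print(counts)
```

If the log ever reports more than one distinct value, the set holds all of them, and a set has no order, so "last" would need the list-based approach instead.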
