Question

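I have a shell command that parses some output and gives me the result I need. I need to implement the same thing in Python, but the shell command contains a newline character "\n", which does not seem to be executed when the command is run from Python.

Among the many lines of the output log, the line I need looks like --configurationFile=/app/log/conf/the_jvm_name.4021.logback.xml

From the above I only need the_jvm_name. The syntax will always be the same. The shell command works fine.

Shell command: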

ps -ef | grep 12345 | tr " " "\n" | grep logback.configurationFile | awk -F"/" '{print $NF}'| cut -d. -f1

Python (with all the required double quotes escaped):

import subprocess
pid_arr = "12345"
sh_command = "ps -ef | grep "+pid_arr+" | tr \" \" \"\n\" | grep configurationFile | awk -F \"/\" '{print $NF}' | cut -d. -f1"
outpt = subprocess.Popen(sh_command , shell=True,stdout=subprocess.PIPE).communicate()[0].decode('utf-8').strip()

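With Python I do not get the desired output. It just prints configurationFile, exactly as written in the command. What am I missing here? Is there a better way to get this detail?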

Answer 1

You can get what you want with a regular expression substitution in Python:

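import re
import subprocess

# text=True (Python 3.7+) decodes the output to str so the str patterns below work
output = subprocess.check_output(["ps", "-ef"], text=True)
for line in output.splitlines():
    if re.search("12345", line):
        name = re.sub(r".*configurationFile=.*/([^.]+).*", r"\1", line)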

This captures the part of the configuration file path that follows the last / and runs up to the next dot.
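To make the behaviour of that substitution concrete, here is a minimal sketch applied to a made-up ps line (the sample text and the helper name `extract_with_sub` are assumptions, not part of the original answer). It also shows one thing worth knowing: `re.sub` returns the input unchanged when the pattern does not match, so a non-matching line comes back whole rather than empty.

```python
import re

# Hypothetical sample line, modeled on the format described in the question.
SAMPLE = ("app 12345 1 0 10:00 ? 00:00:05 java "
          "--configurationFile=/app/log/conf/the_jvm_name.4021.logback.xml")

PATTERN = r".*configurationFile=.*/([^.]+).*"

def extract_with_sub(line):
    """Apply the substitution from the answer to a single line."""
    return re.sub(PATTERN, r"\1", line)

# The whole line is replaced by the captured group.
matched_result = extract_with_sub(SAMPLE)

# A line without configurationFile= is returned exactly as it was passed in.
unmatched_line = "app 777 1 0 10:00 ? 00:00:00 sleep 60"
unmatched_result = extract_with_sub(unmatched_line)
```

If you need to tell the two cases apart, checking `re.search(PATTERN, line)` before substituting is one option.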

You can make this more robust by checking only the second column (the PID) for 12345, either by splitting each line on whitespace:

cols = re.split(r"\s+", line)
if len(cols) > 1 and cols[1] == "12345":

or by using a stricter regular expression, such as:

if re.match(r"\S+\s+12345\s", line):
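The difference between matching 12345 anywhere and matching it only in the PID column can be sketched on canned input. The sample ps output and the function name `lines_for_pid` below are illustrative assumptions; the column layout assumed is the usual `ps -ef` one, with the PID in the second field.

```python
# Hypothetical ps -ef output: the second process merely mentions 12345
# as its parent PID and as an argument.
SAMPLE_PS = """\
UID        PID  PPID  C STIME TTY          TIME CMD
app      12345     1  0 10:00 ?        00:00:05 java --configurationFile=/app/log/conf/the_jvm_name.4021.logback.xml
app      20001 12345  0 10:01 ?        00:00:00 helper --parent 12345
"""

def lines_for_pid(ps_output, pid):
    """Return only the lines whose second column equals pid."""
    matches = []
    for line in ps_output.splitlines():
        cols = line.split()  # split on any run of whitespace
        if len(cols) > 1 and cols[1] == pid:
            matches.append(line)
    return matches

by_column = lines_for_pid(SAMPLE_PS, "12345")
by_substring = [l for l in SAMPLE_PS.splitlines() if "12345" in l]
```

Here the substring test picks up both processes, while the column test keeps only the one whose PID is actually 12345.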

Note that you can also shorten the pipeline by doing something like this:

ps -ef | sed -nE '/12345/ { s/.*configurationFile=.*\/([^.]*).*/\1/; p }'

Answer 2

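Your shell command works, but it has to deal with too many lines of output and too many fields per line. A simpler solution is to tell ps to give you just one line, and on that line only the one field you care about. For example, on my system: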

ps -o cmd h 979

outputs:

/usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/accessibility.conf --nofork --print-address 3

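The -o cmd flag outputs only the CMD column, and the h argument tells ps to omit the header. Finally, 979 is the process ID, which tells ps to output information for that process only.

This output is not exactly what you have, but it is similar enough. Once the output is restricted like this, the other commands (grep, awk, ...) are no longer needed. At this point we can use a regular expression to extract what we want: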

from __future__ import print_function
import re
import subprocess

pid = '979'
command = ['ps', '-o', 'cmd', 'h', pid]
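# universal_newlines=True decodes the output to text; on Python 3 the default is bytes,
# which a str pattern cannot search
output = subprocess.check_output(command, universal_newlines=True)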

pattern = re.compile(r"""
    config-file=  # Literal string search
    .+\/          # Everything up to the last forward slash
    ([^.]+)       # Non-dot chars, this is what we want
""", re.VERBOSE)

matched = pattern.search(output)

if matched:
    print(matched.group(1))

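Notes

For the regular expression I use the verbose form, which lets me annotate the pattern with comments. I like this style because regular expressions can be hard to read.
On your system, adjust the config-file= part of the pattern to match your own output.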
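Adapted to the option from the question, the same idea might look like the sketch below. Several things here are assumptions rather than part of the answer: the function names, the pattern keyed on configurationFile=, and the use of `ps -o args= -p PID` (a more portable way to ask for the command line without a header) in place of `ps -o cmd h PID`. Parsing is kept separate from the ps call so it can be tried on a plain string.

```python
import re
import subprocess

# Assumption: the option looks like the one in the question, e.g.
# --configurationFile=/app/log/conf/the_jvm_name.4021.logback.xml
PATTERN = re.compile(r"""
    configurationFile=  # Literal string search
    \S*/                # Rest of the path, up to its last forward slash
    ([^./\s]+)          # File name up to the first dot, this is what we want
""", re.VERBOSE)

def parse_jvm_name(cmdline):
    """Return the captured name, or None if the option is absent."""
    matched = PATTERN.search(cmdline)
    return matched.group(1) if matched else None

def jvm_name_for_pid(pid):
    """Ask ps for one process's command line and parse it."""
    result = subprocess.run(
        ["ps", "-o", "args=", "-p", str(pid)],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout to str
    )
    # If the PID does not exist, stdout is empty and the parse yields None.
    return parse_jvm_name(result.stdout)
```

Using `\S*` instead of `.*` keeps the match inside the single argument, so slashes in later arguments on the same command line do not affect the result.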

Executing awk in a Python shell
