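Parsing Bash redirects in Python 3

Asked: 2016-08-08 06:25:41

Tags: python regex python-3.x

I'm currently writing a command-parser module for a Python library. It takes fairly complex bash pipelines, splits them, and parses the individual segments.

The regex used by the parser is not very complex, but it does use named groups: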

/(?P<command>.*?)( ((?P<redirect>[&\d-]?)>+ ?&?(?P<filename>\S+)))( ?< ?(?P<infile>.*))?/g

In this form, I've been testing it against the following [contrived] strings:

sed 's/24/25/g'                            # Doesn't pass but not necessary
sed 's/24/25/g' &>/dev/null                # Works
sed 's/24/25/g' 1>&2                       # Works
sed 's/24/25/g' 2>&1 1>/dev/null           # Works
sed 's/24/25/g' &>/dev/null < infile.txt   # Works
grep -rin --col 'i < 24\|b>19' > /dev/null            # Works
grep -rin --col 'i < 24\|b > 19' > /dev/null          # Doesn't work
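
The failure on the last string can be reproduced in isolation. The following is a minimal sketch, assuming Python's `re` module and the pattern above with the `/.../g` delimiters removed; the names `REDIRECT`, `first_match`, and `splits_cleanly` are hypothetical helpers introduced only for illustration.

```python
import re
import shlex

# The pattern from the question, minus the /.../g delimiters.
REDIRECT = re.compile(
    r"(?P<command>.*?)( ((?P<redirect>[&\d-]?)>+ ?&?(?P<filename>\S+)))"
    r"( ?< ?(?P<infile>.*))?"
)


def first_match(line):
    """Return the named groups of the first redirect match, or None."""
    match = REDIRECT.search(line)
    return match.groupdict() if match else None


def splits_cleanly(text):
    """Report whether shlex can tokenize the text without a quoting error."""
    try:
        shlex.split(text)
    except ValueError:
        return False
    return True


good = first_match(r"grep -rin --col 'i < 24\|b>19' > /dev/null")
bad = first_match(r"grep -rin --col 'i < 24\|b > 19' > /dev/null")

print(good["command"])   # grep -rin --col 'i < 24\|b>19'
print(bad["command"])    # grep -rin --col 'i < 24\|b
print(bad["filename"])   # 19'
print(splits_cleanly(bad["command"]))  # False: the quote is never closed
```

The lazy `command` group stops at the first space followed by `>`, and the pattern has no notion of quoting, so the ` > 19'` inside the quoted grep expression is taken as a redirect.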

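I'm not too concerned about matching sed 's/24/25/g' on its own: if nothing matches, I can simply assign the whole string to command. The final grep example is the real problem, because it is entirely feasible for a supplied command to contain > symbols in this way.

Question: can this regex be rewritten to handle the final example without using PCRE?

Example (Python 3):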

import re
import shlex
from collections import namedtuple

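# Python form of the pattern above: one match per "> target" redirect,
# optionally followed by a "< infile" input redirect.
redirect_regex = re.compile(r"(?P<command>.*?)( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))( ?< ?(?P<infile>.*))?", re.DOTALL)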
command_list = [
    "sed 's/24/25/g'",
    "sed 's/24/25/g' &>/dev/null",
    "sed 's/24/25/g' 1>&2",
    "sed 's/24/25/g' 2>&1 1>/dev/null",
    "sed 's/24/25/g' &>/dev/null < infile.txt",
    # Raw strings keep the backslash in \| literal without an invalid-escape warning.
    r"grep -rin --col 'i < 24\|b>19' > /dev/null",
    r"grep -rin --col 'i < 24\|b > 19' > /dev/null"
]

command_structure = namedtuple('CommandStructure', 'command arguments redirects')
redirect = namedtuple('Redirect', 'stdout stderr stdin')
commands = []

for command in command_list:
    for com in command.split(' | '):
        structure = None
        matches = [match.groupdict() for match in redirect_regex.finditer(com)]
        if not matches:
            # No redirects found: treat the whole segment as the command.
            structure = shlex.split(com)
            commands.append(
                command_structure(
                    command=structure[:1],
                    arguments=structure[1:],
                    redirects=None
                )
            )
        else:
            try:
                structure = shlex.split(matches[0]['command'])
            except ValueError as exception:
                print('Failed to parse command "' + com + '"')
                print('    reason was: ' + str(exception))
                continue
            structure_redirects = []
            for match in matches:
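                # A bare ">" or "1>" targets stdout, "2>" targets stderr, "&>" both.
                stdout = match['filename'] if match['redirect'] in ['1', '', '&'] else None
                stderr = match['filename'] if match['redirect'] in ['2', '&'] else None
                # groupdict() yields None for groups that did not participate;
                # hasattr() on a dict never sees its keys, so it was always False.
                stdin = match['infile']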
                structure_redirects.append(
                    redirect(stdout=stdout, stderr=stderr, stdin=stdin)
                )
            commands.append(
                command_structure(
                    command=structure[:1],
                    arguments=structure[1:],
                    redirects=structure_redirects
                )
            )

print('--------------------------------')
for command in commands:
    print(command)
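
One possible direction that stays within Python's `re` module is to blank out quoted spans before applying the pattern, then slice the original string using the match offsets. The following is only a sketch under stated assumptions: `QUOTED`, `mask_quotes`, and `parse_redirects` are hypothetical names, only plain single- and double-quoted strings are recognised (no `\'` outside quotes, no `$'...'`), and the `infile` group is left out for brevity.

```python
import re
import shlex

# Python form of the question's pattern.
REDIRECT = re.compile(
    r"(?P<command>.*?)( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))"
    r"( ?< ?(?P<infile>.*))?",
    re.DOTALL,
)

# Single-quoted spans, or double-quoted spans with backslash escapes.
QUOTED = re.compile(r"'[^']*'" r'|"(?:\\.|[^"\\])*"')


def mask_quotes(text):
    """Replace each quoted span with same-length filler so offsets line up."""
    return QUOTED.sub(lambda m: "_" * len(m.group(0)), text)


def parse_redirects(text):
    """Return (argv, redirects); each redirect is a (marker, target) pair."""
    masked = mask_quotes(text)
    matches = list(REDIRECT.finditer(masked))
    if not matches:
        return shlex.split(text), []
    # Match on the masked copy, but slice the original text.
    start, end = matches[0].span("command")
    argv = shlex.split(text[start:end])
    redirects = []
    for m in matches:
        f_start, f_end = m.span("filename")
        redirects.append((m.group("redirect"), text[f_start:f_end]))
    return argv, redirects


argv, redirects = parse_redirects(r"grep -rin --col 'i < 24\|b > 19' > /dev/null")
print(argv)       # ['grep', '-rin', '--col', 'i < 24\\|b > 19']
print(redirects)  # [('', '/dev/null')]
```

Because the filler has the same length as the span it replaces, every offset reported for the masked copy is valid in the original string, so the existing pattern can be reused unchanged.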

0 answers:

No answers yet.