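I'm currently writing a command parser module for a Python library. It takes fairly complex bash pipelines, splits them up, and parses the individual segments.
The regular expression used by the parser is not complicated, but it does use named groups: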
/(?P<command>.*?)( ((?P<redirect>[&\d-]?)>+ ?&?(?P<filename>\S+)))( ?< ?(?P<infile>.*))?/g
In this form, I have been testing it against the following [contrived] strings:
sed 's/24/25/g' # Doesn't pass but not necessary
sed 's/24/25/g' &>/dev/null # Works
sed 's/24/25/g' 1>&2 # Works
sed 's/24/25/g' 2>&1 1>/dev/null # Works
sed 's/24/25/g' &>/dev/null < infile.txt # Works
grep -rin --col 'i < 24\|b>19' > /dev/null # Works
grep -rin --col 'i < 24\|b > 19' > /dev/null # Doesn't work
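To make the failure on the last string concrete, here is a small check using the Python form of the pattern from the example script in this post. It is only a diagnostic sketch; the variable names `failing`, `captured_command`, `captured_filename` and `shlex_error` are made up for illustration.

```python
import re
import shlex

# Python form of the pattern, copied from the example script in this post.
redirect_regex = re.compile(
    r"(?P<command>.*?)( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))( ?< ?(?P<infile>.*))?",
    re.DOTALL,
)

failing = r"grep -rin --col 'i < 24\|b > 19' > /dev/null"
match = redirect_regex.search(failing)

# The lazy command group stops at the first " >", which sits inside the quotes.
captured_command = match.group("command")
captured_filename = match.group("filename")
print(repr(captured_command))
print(repr(captured_filename))

# The captured command now has an unbalanced quote, so shlex rejects it.
try:
    shlex.split(captured_command)
    shlex_error = None
except ValueError as exc:
    shlex_error = str(exc)
print(shlex_error)
```

The pattern has no notion of quoting, so a space followed by `>` inside a quoted argument looks exactly like the start of a redirect.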
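I'm not bothered about matching sed 's/24/25/g', since if there is no match I can assign the whole string as the command. The final grep example does matter, though, because it is entirely plausible for a supplied command to contain > symbols in this way.
Question: can this regular expression be rewritten to handle the final example without using the pcre library?
Example (python3):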
import re
import shlex
from collections import namedtuple

redirect_regex = re.compile(
    r"(?P<command>.*?)( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))( ?< ?(?P<infile>.*))?",
    re.DOTALL,
)

# Raw strings keep the backslash in \| literal and avoid invalid-escape warnings.
command_list = [
    r"sed 's/24/25/g'",
    r"sed 's/24/25/g' &>/dev/null",
    r"sed 's/24/25/g' 1>&2",
    r"sed 's/24/25/g' 2>&1 1>/dev/null",
    r"sed 's/24/25/g' &>/dev/null < infile.txt",
    r"grep -rin --col 'i < 24\|b>19' > /dev/null",
    r"grep -rin --col 'i < 24\|b > 19' > /dev/null",
]

command_structure = namedtuple('CommandStructure', 'command arguments redirects')
redirect = namedtuple('Redirect', 'stdout stderr stdin')

commands = []
for command in command_list:
    # Split the pipeline into its segments.
    for com in command.split(' | '):
        structure = None
        matches = [match.groupdict() for match in redirect_regex.finditer(com)]
        if len(matches) == 0:
            # No redirect found: treat the whole segment as the command.
            structure = shlex.split(com)
            commands.append(
                command_structure(
                    command=structure[:1],
                    arguments=structure[1:],
                    redirects=None
                )
            )
        else:
            try:
                structure = shlex.split(matches[0]['command'])
            except ValueError as exception:
                print('Failed to parse command "' + com + '"')
                print(' reason was: ' + str(exception))
                continue
            structure_redirects = []
            for match in matches:
                stdout = match['filename'] if match['redirect'] in ['1', '', '&'] else None
                stderr = match['filename'] if match['redirect'] in ['2', '&'] else None
                # groupdict() always has the key; it is None when the group did not match.
                # (hasattr() on a dict checks attributes, not keys, so it was always False.)
                stdin = match['infile']
                structure_redirects.append(
                    redirect(stdout=stdout, stderr=stderr, stdin=stdin)
                )
            commands.append(
                command_structure(
                    command=structure[:1],
                    arguments=structure[1:],
                    redirects=structure_redirects
                )
            )

print('--------------------------------')
for command in commands:
    print(command)
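One possible direction, offered only as a sketch and not as a tested replacement: with the standard `re` module, the command group can be made to consume quoted strings as whole units, so that a `>` or `<` inside quotes is never considered as a redirect. The name `quote_aware_regex` is hypothetical. The sketch assumes quotes are balanced and ignores backslash-escaped quotes inside double quotes, as well as unquoted escapes such as `\>`.

```python
import re
import shlex

# Hypothetical variant of the pattern: the command group is built from
# single-quoted strings, double-quoted strings, or single characters that
# are neither quotes nor < or >.
quote_aware_regex = re.compile(
    r"""(?P<command>(?:'[^']*'|"[^"]*"|[^'"<>])*?)"""
    r"""( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))"""
    r"""( ?< ?(?P<infile>.*))?""",
    re.DOTALL,
)

last_example = r"grep -rin --col 'i < 24\|b > 19' > /dev/null"
m = quote_aware_regex.search(last_example)
parsed_command = shlex.split(m.group("command"))
print(parsed_command, m.group("redirect"), m.group("filename"))

# Multiple redirects are still picked up one match at a time by finditer().
two_redirects = "sed 's/24/25/g' 2>&1 1>/dev/null"
pairs = [
    (x.group("redirect"), x.group("filename"))
    for x in quote_aware_regex.finditer(two_redirects)
]
print(pairs)
```

Everything after the command group is unchanged from the original pattern, so the rest of the script can keep reading the same named groups. A real shell grammar has more cases than this covers (here-documents, escaped characters, nested quoting), so a tokenizer-based approach may be more robust in the long run.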