Perl regex to escape quoted strings inside quotes

Asked: 2012-01-02 20:43:05

Tags: regex perl escaping

I am writing a command-line interpreter, but I am drawing a blank on this kind of input:

command -text "hello this is "some text" with "quotes inside"" -other "another thing""" -another " -another "text"

I need to escape the inner quotes and then feed the string to my parser. What I came up with:

/".+"/ but it grabs everything starting at the first quote.
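A minimal sketch of why that pattern misbehaves, assuming a shortened sample input that is not the exact line from the question: the greedy `.+` runs from the first quote to the last quote on the line, and the non-greedy variant stops too early, so neither isolates an argument.

```perl
use strict;
use warnings;

# Hypothetical sample input, shortened from the question.
my $input = 'command -text "hello "some text" here" -other "thing"';

# Greedy: .+ extends from the first quote to the last quote on the line.
my ($greedy) = $input =~ /(".+")/;

# Non-greedy: .+? stops at the nearest following quote instead.
my ($lazy) = $input =~ /(".+?")/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

Both results are wrong for this purpose, which suggests that a single quote-to-quote pattern cannot tell outer quotes from inner ones without more context.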

Do you have any insight?

Edit: here is what I want:

Input: command -text "hello this is "some text" with "quotes inside"" -other "another thing""" -another " -another "text"

Output: command -text "hello this is \"some text\" with \"quotes inside\"" -other "another thing\"\"" -another " -another \"text"

2 Answers:

Answer 0: (score: 4)

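You are heading down a bad path, IMO. You need some unambiguous way to delimit the arguments. If you want to allow chaos inside the arguments, you need something else to bring order.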

In other words, you could use something like
-label1 [random characters] -nextlabel [random characters]

But that still means you cannot use the dash - in certain combinations, because something like -text "some random -text" would break it.

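It sounds to me like you want a foolproof solution to compensate for users who do not know what they are doing. But automating the fix will introduce errors of its own. Just write an airtight command parser, and raise an error when a string is quoted incorrectly. Let the users correct their input; that is not your job.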
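One way such a strict parser might look, as a sketch only. The name `parse_command` and the grammar are assumptions for illustration, not something the answer specifies: tokens are separated by whitespace, and a token is either a bare word or a double-quoted string in which inner quotes must already be written as \". Anything else is rejected with the column of the offending token.

```perl
use strict;
use warnings;

# parse_command is a hypothetical helper, not code from the answer.
# Assumed grammar: whitespace-separated tokens; a token is either a bare
# word or a double-quoted string where \" and \\ are the escapes.
sub parse_command {
    my ($line) = @_;
    my @tokens;
    pos($line) = 0;
    while (1) {
        $line =~ /\G\s+/gc;                        # skip whitespace
        last if pos($line) >= length $line;
        if ($line =~ /\G"((?:[^"\\]|\\.)*)"(?=\s|\z)/gc) {
            (my $value = $1) =~ s/\\(.)/$1/g;      # drop the escaping backslashes
            push @tokens, $value;
        }
        elsif ($line =~ /\G([^\s"]+)(?=\s|\z)/gc) {
            push @tokens, $1;                      # bare word such as -text
        }
        else {
            die sprintf "quoting error at column %d\n", pos($line) + 1;
        }
    }
    return @tokens;
}

# Correctly escaped input is split into tokens.
my @ok = parse_command('command -text "hello \"some text\" here" -other thing');
print join('|', @ok), "\n";

# Unescaped inner quotes are reported instead of being guessed at.
my @bad = eval { parse_command('command -text "hello "some text" here"') };
print "rejected: $@" if $@;
```

The `\G` anchor with the `/gc` flags keeps the match position after a failed attempt, so each alternative is tried at the same offset and the error message can point at it.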

Answer 1: (score: 0)

Well, I have a really ugly solution... It handles everything, but it breaks on dashes inside the strings.

I hope your eyes do not hurt...

fg@erwin ~ $ perl -ne 'my @l; foreach (split /-/) { my ($start, $middle, $end) = m/^([^"]+)"(.*)"([^"]*)$/ or do { push @l, $_; next; }; $middle =~ s/"/\\"/g;push @l, "$start\"$middle\"$end"; } END { print join("-", @l) . "\n";}' <<EOF
> command -text "hello this is "some text" with "quotes inside"" -other "another thing""" -another " -another "text"
> EOF
command -text "hello this is \"some text\" with \"quotes inside\"" -other "another thing\"\"" -another " -another "text"

A bit of explanation...

my @l;  # declare a new list l, which will contain the "transformed" strings
foreach (split /-/) {  # split against the dash -- no loop variable: $_ has the content
    # Try and capture in start, middle and end:
    # * start: from the beginning until the first quote;
    # * middle: everything but the last quote and whatever is after it;
    # * end: what is after the last quote
    # By default, m// applies to $_
    my ($start, $middle, $end) = m/^([^"]+)"(.*)"([^"]*)$/
        or do { # if no match...
            push @l, $_;  # put the line as is,
            next;         # and continue
        };
    $middle =~ s/"/\\"/g; # in the middle part, precede all quotes with a \
    push @l, "$start\"$middle\"$end"; # and push the final string onto the list
}

# finally, print all transformed lines joined with a dash, plus a newline
END { print join("-", @l) . "\n";}
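For reuse outside a one-liner, the same logic can be wrapped in a function. This is a sketch: `escape_inner_quotes` is a hypothetical name, the sample inputs are made up, and the only intended behavioral difference is that `split` is given a limit of -1 so trailing empty pieces are kept. The second call shows the limitation mentioned above, where a dash inside a quoted string splits the string in the wrong place.

```perl
use strict;
use warnings;

# escape_inner_quotes is a hypothetical name; the logic mirrors the one-liner.
sub escape_inner_quotes {
    my ($line) = @_;
    my @parts;
    foreach my $piece (split /-/, $line, -1) {
        # start: up to the first quote; middle: between the first and last
        # quote; end: whatever follows the last quote
        if (my ($start, $middle, $end) = $piece =~ /^([^"]+)"(.*)"([^"]*)$/) {
            $middle =~ s/"/\\"/g;          # escape only the inner quotes
            push @parts, qq{$start"$middle"$end};
        }
        else {
            push @parts, $piece;           # no quoted section: keep as is
        }
    }
    return join '-', @parts;
}

# Inner quotes are escaped when no dash appears inside the string.
print escape_inner_quotes('cmd -text "say "hi" now" -n "x"'), "\n";

# A dash inside the quoted text defeats the split; nothing gets escaped.
print escape_inner_quotes('cmd -text "a "b-c" d"'), "\n";
```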