Encode characters if not already encoded (with rawurlencode and preg_replace_callback)

Time: 2017-11-08 22:08:02

Tags: php regex url encoding character-encoding

I want to replace all characters in a string with their percent-encoding representation (%xy), but only the ones that are not already percent-encoded.

For example, in the string abc#%2Bdef, the %2B part is already a percent-encoded representation. So it should not be re-encoded. The correct result after encoding should be: abc%23%2Bdef.

This is what I have tried - but the result is still abc#%2Bdef:

// Pattern: read all characters except the percent-encoded ones (%xy).
$pattern = '/(?!%[a-fA-F0-9]{2})/';
$string = 'abc#%2Bdef';

$result = preg_replace_callback($pattern, function($matches) {
    return rawurlencode($matches[0]);
}, $string);

var_dump($result);

I think it's just the $pattern value that should be changed, but I'm not sure. And with the current pattern the rawurlencode() inside the callback is not called.

Encoding legend: %23 -> #, %2B -> +

I spent many hours today trying to find the right pattern, and it seemed so simple at the beginning... I would really appreciate any advice or solution.

Thank you very much.

1 answer:

Answer 0 (score: 2)

The simple way is to first decode the characters that are already encoded, and then re-encode the whole string.

$string = 'abc#%2Bdef';
$string = rawurlencode(rawurldecode($string));

This will give you the expected result.

abc%23%2Bdef
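
As for why the pattern in the question changes nothing: a negative lookahead such as (?!%[a-fA-F0-9]{2}) is zero-width, so every match is an empty string. The callback does run, but rawurlencode('') returns '', and the input comes back unchanged.

If the existing %xy sequences have to be kept byte for byte (the decode/encode round trip rewrites %2b as %2B and turns %41 into A), a regex-only variant is possible. The following is a minimal sketch, not a definitive implementation: the function name encodeUnencoded is hypothetical, and it assumes that "needs encoding" means every byte rawurlencode() would encode, i.e. everything outside the RFC 3986 unreserved set.

```php
<?php

// Hypothetical helper (not part of the original answer): percent-encodes every
// byte that rawurlencode() would encode, but leaves existing %XX sequences as is.
function encodeUnencoded(string $input): string
{
    // First alternative: an existing %XX escape (consumed so it is not re-encoded).
    // Second alternative: any single byte that is not an unreserved character.
    $pattern = '/%[0-9A-Fa-f]{2}|[^A-Za-z0-9\-._~]/';

    return preg_replace_callback($pattern, function (array $m): string {
        // Only the first alternative can produce a 3-byte match; keep it untouched.
        return strlen($m[0]) === 3 ? $m[0] : rawurlencode($m[0]);
    }, $input);
}

echo encodeUnencoded('abc#%2Bdef'), "\n";            // abc%23%2Bdef
echo encodeUnencoded('%41%2b'), "\n";                // %41%2b (kept as written)
echo rawurlencode(rawurldecode('%41%2b')), "\n";     // A%2B (normalised by the round trip)
echo encodeUnencoded('100%'), "\n";                  // 100%25 (a stray % is encoded)
```

Which of the two fits better depends on whether normalising the existing escapes is acceptable; for the sample string both give abc%23%2Bdef.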