PHP implementation of JavaScript escape() and unescape()

Time: 2015-03-11 02:23:48

Tags: javascript php encoding escaping

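First, I understand that JS escape() and unescape() are deprecated. Basically, we have a legacy system that runs JS escape() on data before storing it in the DB, and every time we need to unescape() the data on the client side to display the actual value. (I know it is silly, but it was done many years ago to support Unicode characters in a database that was not Unicode-compliant.)

Is there an existing PHP implementation that emulates the JavaScript escape() and unescape() functions?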

2 answers:

Answer 0: (score: 0)

You are looking for urlencode(). If the output of that encoding is not acceptable, you can try rawurlencode().

More information here:

http://php.net/manual/en/function.urldecode.php

http://php.net/manual/en/function.urlencode.php
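For comparison, here is a small sketch of what these two functions return for a string containing a space, an ampersand, and two non-ASCII characters. The sample string is an arbitrary assumption, written as UTF-8 byte escapes so the file encoding does not matter. Both functions percent-encode the UTF-8 bytes, whereas JavaScript's escape() emits %XX for code units below 256 and %uXXXX above that, so the results may not be interchangeable with data already stored by escape().

```php
<?php
// Arbitrary sample: "a b&ä你" spelled out as UTF-8 bytes.
$sample = "a b&\xC3\xA4\xE4\xBD\xA0";

echo urlencode($sample), "\n";     // a+b%26%C3%A4%E4%BD%A0
echo rawurlencode($sample), "\n";  // a%20b%26%C3%A4%E4%BD%A0

// For the same text, JavaScript's escape() returns: a%20b%26%E4%u4F60
```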

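However, if you only want to encode the data so that it can be stored in a MySQL database, you can use the built-in MySQL escape-string function to convert the input into a format that is safe to insert into the database.

See: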

http://php.net/manual/en/mysqli.real-escape-string.php

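Answer 1: (score: 0)

After some searching, I was able to put together two PHP functions that do what I want. The code is not pretty, but it has worked for 100% of the data we have seen so far, so I wanted to share it here.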

/**
 *  Simulate javascript escape() function
 */
function escapejs($source) {
    $map = array(
       '~'        => '%7E'
      ,'!'        => '%21'
      ,'\''       => '%27'       // single quote
      ,'('        => '%28'
      ,')'        => '%29'
      ,'#'        => '%23'
      ,'$'        => '%24'
      ,'&'        => '%26'
      ,','        => '%2C'
      ,':'        => '%3A'
      ,';'        => '%3B'
      ,'='        => '%3D'
      ,'?'        => '%3F'
      ,' '       => '%20'       // space
      ,'"'        => '%22'       // double quote
      ,'%'        => '%25'
      ,'<'        => '%3C'
      ,'>'        => '%3E'
      ,'['        => '%5B'
      ,'\\'       => '%5C'       // backslash
      ,']'        => '%5D'
      ,'^'        => '%5E'
      ,'{'        => '%7B'
      ,'|'        => '%7C'
      ,'}'        => '%7D'
      ,'`'        => '%60'
      ,chr(9)     => '%09'
      ,chr(10)    => '%0A'
      ,chr(13)    => '%0D'
      ,'¡'       => '%A1'
      ,'¢'       => '%A2'
      ,'£'       => '%A3'
      ,'¤'       => '%A4'
      ,'¥'       => '%A5'
      ,'¦'       => '%A6'
      ,'§'       => '%A7'
      ,'¨'       => '%A8'
      ,'©'       => '%A9'
      ,'ª'       => '%AA'
      ,'«'       => '%AB'
      ,'¬'       => '%AC'
      ,"\xC2\xAD" => '%AD'       // soft hyphen (U+00AD), written as UTF-8 bytes
      ,'®'       => '%AE'
      ,'¯'       => '%AF'
      ,'°'       => '%B0'
      ,'±'       => '%B1'
      ,'²'       => '%B2'
      ,'³'       => '%B3'
      ,'´'       => '%B4'
      ,'µ'       => '%B5'
      ,'¶'       => '%B6'
      ,'·'       => '%B7'
      ,'¸'       => '%B8'
      ,'¹'       => '%B9'
      ,'º'       => '%BA'
      ,'»'       => '%BB'
      ,'¼'       => '%BC'
      ,'½'       => '%BD'
      ,'¾'       => '%BE'
      ,'¿'       => '%BF'
      ,'À'       => '%C0'
      ,'Á'       => '%C1'
      ,'Â'       => '%C2'
      ,'Ã'       => '%C3'
      ,'Ä'       => '%C4'
      ,'Å'       => '%C5'
      ,'Æ'       => '%C6'
      ,'Ç'       => '%C7'
      ,'È'       => '%C8'
      ,'É'       => '%C9'
      ,'Ê'       => '%CA'
      ,'Ë'       => '%CB'
      ,'Ì'       => '%CC'
      ,'Í'       => '%CD'
      ,'Î'       => '%CE'
      ,'Ï'       => '%CF'
      ,'Ð'       => '%D0'
      ,'Ñ'       => '%D1'
      ,'Ò'       => '%D2'
      ,'Ó'       => '%D3'
      ,'Ô'       => '%D4'
      ,'Õ'       => '%D5'
      ,'Ö'       => '%D6'
      ,'×'       => '%D7'
      ,'Ø'       => '%D8'
      ,'Ù'       => '%D9'
      ,'Ú'       => '%DA'
      ,'Û'       => '%DB'
      ,'Ü'       => '%DC'
      ,'Ý'       => '%DD'
      ,'Þ'       => '%DE'
      ,'ß'       => '%DF'
      ,'à'       => '%E0'
      ,'á'       => '%E1'
      ,'â'       => '%E2'
      ,'ã'       => '%E3'
      ,'ä'       => '%E4'
      ,'å'       => '%E5'
      ,'æ'       => '%E6'
      ,'ç'       => '%E7'
      ,'è'       => '%E8'
      ,'é'       => '%E9'
      ,'ê'       => '%EA'
      ,'ë'       => '%EB'
      ,'ì'       => '%EC'
      ,'í'       => '%ED'
      ,'î'       => '%EE'
      ,'ï'       => '%EF'
      ,'ð'       => '%F0'
      ,'ñ'       => '%F1'
      ,'ò'       => '%F2'
      ,'ó'       => '%F3'
      ,'ô'       => '%F4'
      ,'õ'       => '%F5'
      ,'ö'       => '%F6'
      ,'÷'       => '%F7'
      ,'ø'       => '%F8'
      ,'ù'       => '%F9'
      ,'ú'       => '%FA'
      ,'û'       => '%FB'
      ,'ü'       => '%FC'
      ,'ý'       => '%FD'
      ,'þ'       => '%FE'
      ,'ÿ'       => '%FF'
    );

    $convmap = array(0x80, 0x10ffff, 0, 0xffffff);

    $org = $source;

    // make sure string is UTF8
    if (false === mb_check_encoding($source, 'UTF-8')) {
        if (false === ($source = iconv(mb_detect_encoding($source, mb_detect_order(), true), "UTF-8", $source))) {
          $source = $org;
        }
    }

    $chrArray = preg_split('//u', $source, -1, PREG_SPLIT_NO_EMPTY);  // split up the UTF8 string into chars
    $oChrArray = array();

    foreach ($chrArray as $index => $chr) {

      if (isset($map[$chr])) {
        $chr = $map[$chr];
      }
      // if char doesn't fall within ASCII then assume unicode, get the hex html entities
      //elseif (mb_detect_encoding($chr, 'ASCII', true) !== 'ASCII') {
      else {
        $chr = mb_encode_numericentity($chr, $convmap, "UTF-8", true);

        // since we will be converting the &#x notation to the non-standard %u notation for backward compatibility, make sure the code is 4 digits long by prepending a 0
        if (substr($chr, 0, 3) == '&#x' && substr($chr, -1) == ';' && strlen($chr) == 7)
          $chr = '&#x0'.substr($chr, 3);
      }

      $oChrArray[] = $chr;
    }
    $decodedStr = implode('', $oChrArray);
    $decodedStr = preg_replace('/&#x([0-9A-F]{4});/', '%u$1', $decodedStr);   // we need to use the %uXXXX format to simulate results generated with js escape()
    return $decodedStr;
}
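The function above relies on a hand-written lookup table. As an alternative, here is a minimal sketch that walks the string as UTF-16 code units, which is how escape() sees it in JavaScript. The name escapejs_utf16 is hypothetical, and the sketch assumes valid UTF-8 input and a loaded mbstring extension; it has not been checked against the legacy data described in the question.

```php
<?php
// Hypothetical alternative to escapejs(): iterate over UTF-16 code units.
// Assumes $source is valid UTF-8 and that the mbstring extension is available.
function escapejs_utf16($source) {
    // Characters that escape() leaves untouched.
    $safe = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./';
    $utf16 = mb_convert_encoding($source, 'UTF-16BE', 'UTF-8');
    $out = '';
    foreach (unpack('n*', $utf16) as $unit) {       // one 16-bit big-endian unit at a time
        if ($unit < 0x80 && strpos($safe, chr($unit)) !== false) {
            $out .= chr($unit);                      // unreserved ASCII is copied as-is
        } elseif ($unit < 0x100) {
            $out .= sprintf('%%%02X', $unit);        // %XX for code units below 256
        } else {
            $out .= sprintf('%%u%04X', $unit);       // %uXXXX otherwise, surrogates included
        }
    }
    return $out;
}

echo escapejs_utf16("a b&\xC3\xA4\xE4\xBD\xA0"), "\n";  // a%20b%26%E4%u4F60
```

Because characters outside the Basic Multilingual Plane become two UTF-16 code units, this sketch emits them as a pair of %uXXXX sequences.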

/**
 *  Simulate javascript unescape() function
 */
function unescapejs($source) {
    $source = str_replace('%0B', '', $source);    // strip out vertical tabs
    $s = preg_replace('/%u([0-9A-F]{4})/i', '&#x$1;', $source);   // %uXXXX becomes &#xXXXX; (hex digits only)
    $s = preg_replace('/%([0-9A-F]{2})/i', '&#x$1;', $s);          // %XX becomes &#xXX; (hex digits only)
    return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
}
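The function above decodes through HTML entities. As a sketch of another route, the escaped text can be turned back into UTF-16 code units and converted to UTF-8 in one step. The name unescapejs_utf16 is hypothetical; the sketch assumes the input is ASCII text produced by escape() and that mbstring is loaded. Sequences that are not valid %XX or %uXXXX escapes are passed through unchanged.

```php
<?php
// Hypothetical alternative to unescapejs(): rebuild UTF-16, then convert to UTF-8.
// Assumes ASCII input as produced by escape(); raw bytes >= 0x80 would be read as Latin-1.
function unescapejs_utf16($source) {
    // Tokens: %uXXXX, %XX, or any single byte (including a stray '%').
    preg_match_all('/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})|(.)/s', $source, $tokens, PREG_SET_ORDER);
    $utf16 = '';
    foreach ($tokens as $t) {
        if (isset($t[3])) {
            $unit = ord($t[3]);          // literal byte
        } elseif (isset($t[2])) {
            $unit = hexdec($t[2]);       // %XX
        } else {
            $unit = hexdec($t[1]);       // %uXXXX
        }
        $utf16 .= pack('n', $unit);      // append as a big-endian 16-bit code unit
    }
    return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
}

echo unescapejs_utf16('a%20b%26%E4%u4F60'), "\n";  // prints "a b&ä你" as UTF-8
```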