PHP中的json_encode()返回转义Unicode中的中文字符

时间:2009-11-29 03:48:58

标签: php json

我有一个简单的PHP一维数组。

当我执行var转储(echo var_dump($a))时,我将其作为输出:

array(3) { [0]=>  string(3) "尽" [1]=>  string(21) "exhausted||to exhaust" [2]=>  string(4) "jin3" }

然而,当我json_encode它(echo json_encode($a))时,我得到了这个:

["\u5c3d","exhausted||to exhaust","jin3"]

它返回的十六进制值是正确的,但我无法弄清楚如何停止它给我十六进制。我只是想让它显示角色。

如果我echo mb_internal_encoding()它返回UTF-8,这就是我设置的内容。我在所有的字符串操作中都非常小心地使用mb_函数,所以没有数据搞砸了。

我知道我可以写一个修改过的json_encode函数来处理这个问题。但我想知道这里发生了什么。

6 个答案:

答案 0 :(得分:4)

json_encode()的行为完全正确,但不必要。在PHP 5.4中,可以使用JSON_UNESCAPED_UNICODE标志禁用它。

答案 1 :(得分:3)

我知道这个问题比较老,但我认为我借用了我的中国to_jsonto_utf8函数 - 其中包括一些很好的格式(JSON_PRETTY_PRINT),在开发时与缩小的生产。 (适应您自己的环境/系统)


简单

// Produces JSON with Chinese Characters fully un-encoded.
// NOT RFC4627 compliant
json_encode($data, JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES); 


to_json()

function to_json($data, $pretty=null, $inculde_security=false, $try_to_recover=true) {
  // @Note: json_encode() *REQUIRES* data to be in valid UTF8 format BEFORE
  //                    trying to json_encode   and since we are working with Chinese
  //                    characters, we need to make sure that we explicitly allow:
  //                    JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES
  //                    *Unless a mode is explicitly passed into the function
    $json_encoded = '{}';
    if ($pretty === null && is_env_prod()) { // @NOTE: Substitute with your own Production env check
        $json_encoded = json_encode( $data, JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES );
    } else if ($pretty === null && is_env_dev()){ // @NOTE: Substitute with your own Development env check
        $json_encoded = json_encode( $data, JSON_PRETTY_PRINT|JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES );
    } else {
        // PRODUCTION
        $json_encoded = json_encode( $data, $pretty );
    }



    // (1) Do not return an error if the inital data was empty
    // (2) Return an error if json_encode() failed
    if (json_last_error() > 0) {
        if (!!$data || !empty($data)) {
            if (!$json_encoded == false || empty($json_encoded) || $json_encoded == '{}') {
                $json_encoded = json_encode([
                    'status' => false,
                    'error' => [
                        'json_last_error' => json_last_error(),
                        'json_last_error_msg' => json_last_error_msg()
                    ]
                ]);
            } else if (!!$try_to_recover) {
                // there was data in $data so lets try to forensically recover a little? by removing $k => $v pairs that fail to be JSON encoded
                foreach (((array) $data) as $k => $v) {
                    if (!json_encode([$k => $v])) {
                        if (is_array($data)) {
                            unset($data[$k]);
                        } else if (is_object($data)) {
                            unset($data->{$k});
                        }
                    }
                }

                // if the data still is not empty, and there is a status set in the data
                //      then set it to false and add a error message/data
                //      ONLY for Array & Objects
                if (!empty($json_encoded) && count($json_encoded) < 1) {
                    if (!json_encode($data)) {
                        if (is_array($json_encoded)) {
                            $json_encoded['status'] = false;
                            $json_encoded['message'] = "json_encoding_error";
                            $json_encoded['error'] = [
                                'json_last_error' => json_last_error(),
                                'json_last_error_msg' => json_last_error_msg()
                            ];
                        } else if (is_object($json_encoded)) {
                            $json_encoded->status = false;
                            $json_encoded->message = "json_encoding_error";
                            $json_encoded->error = [
                                'json_last_error' => json_last_error(),
                                'json_last_error_msg' => json_last_error_msg()
                            ];
                        }
                    } else {
                      // We have removed the offending data
                      return to_json($data, $pretty, $include_security, $try_to_recover);
                    }
                }

                // we've cleaned out any data that was causing the problem, and included
                //      false to indicate this is a one-time recursion recovery.
                return $this->to_json($pretty, $include_security, false);
            }
        } else { } // don't do anything as the value is already false
    }

  return ( ($inculde_security) ? ")]}',\n" : '' ) . $json_encoded;
}

另一个可能有用的功能是我的递归to_utf8()功能:

to_utf8()

// @NOTE: Common Chinese GBK encoding: to_utf8($data, 'GB2312')
function to_utf8($in, $source_encoding='HTML-ENTITIES') {
  if (is_string($in)) {
    return mb_convert_encoding(
      $in,
      $source_encoding,
      'UTF-8'
    );
  } else if (is_array($in) || is_object($in)) {

    array_walk_recursive($in, function(&$item, &$key) {
      $key = to_utf8($key);

      if (is_object($item) || is_array($item)) {
        $item = to_utf8($item);
      } else {
        if (!mb_detect_encoding($item, 'UTF-8', true)){
          $item = utf8_encode($item);
        }
      }
    });

    $ret_object = is_object($in);
    return ($ret_object) ? (object) $in : (array) $in;
  }

  return $in;
}

验证RFC4627(有效JSON)

$pcre_regex = '
  /
  (?(DEFINE)
     (?<number>   -? (?= [1-9]|0(?!\d) ) \d+ (\.\d+)? ([eE] [+-]? \d+)? )
     (?<boolean>   true | false | null )
     (?<string>    " ([^"\\\\]* | \\\\ ["\\\\bfnrt\/] | \\\\ u [0-9a-f]{4} )* " )
     (?<array>     \[  (?:  (?&json)  (?: , (?&json)  )*  )?  \s* \] )
     (?<pair>      \s* (?&string) \s* : (?&json)  )
     (?<object>    \{  (?:  (?&pair)  (?: , (?&pair)  )*  )?  \s* \} )
     (?<json>   \s* (?: (?&number) | (?&boolean) | (?&string) | (?&array) | (?&object) ) \s* )
  )
  \A (?&json) \Z
  /six
';

$matches = false;
preg_match($pcre_regex, trim($body), $matches);

var_dump('RFC4627 Verification (Regex) ', [
  'has_passed' => (count($matches) == 1) ? 'YES' : 'NO',
  'matches'    => $matches
]);

is_json()

// One Liner
is_string($json_string) && !preg_match('/[^,:{}\\[\\]0-9.\\-+Eaeflnr-u \\n\\r\\t]/', preg_replace('/"(\\.|[^"\\\\])*"/', '', $json_string));

// Alt Function — more consistant
function is_json($json_string) {
  if (!is_string($json_string) || is_numeric($json_string)) {
      return false;
  }

  $val = @json_decode($json_string);

  return ($val != null) && (json_last_error() === JSON_ERROR_NONE);

  // Inconsistant results, reverted to json_decode() + JSON_ERROR_NONE check
  // return is_string($json_string) && !preg_match('/[^,:{}\\[\\]0-9.\\-+Eaeflnr-u \\n\\r\\t]/', preg_replace('/"(\\.|[^"\\\\])*"/', '', $json_string));
}

is_utf8()

function is_utf8($str) {
  if (is_array($str)) {
    foreach ($str as $k=>$v) {
      if (is_string($v) && !is_utf8($v)) {
        return false;
      }
    }
  }

  return (is_string($str) && preg_match('//u', $str));
}

答案 2 :(得分:0)

我对json_encode()做了同样的调查。我的结论是你不能改变行为,但它不会导致任何问题,所以我将离开。

如果你真的不喜欢它,请在preg_replace_callback()输出上json_encode()并将代码点hex转换回UTF-8字符。

答案 3 :(得分:0)

如果您使用mysql创建阵列,可以使用:

mysql_query("SET NAMES utf8");

这将在utf8中创建一个结果。

我没有尝试,但你可能希望在php中查看utf8_encodeutf8_decode函数。

答案 4 :(得分:0)

尝试此操作,您正在json中发布数据,并且想要提交中文或日语或古兰经或任何其他语言字符

import React, { Component } from 'react';
import { BrowserRouter, Route, Switch } from "react-router-dom";

import HomeComponent from '../Components/Home';
import LoginComponent from '../Components/Login';
import SignupComponent from '../Components/Signup';
import MyStoriesComponent from '../Components/Story/MyStories';
import AddStoryComponent from '../Components/Story/AddStory';

export default class Routes extends Component {
    render() {
        return (
            <BrowserRouter>
                <Switch>
                    <Route exact path="/" component={HomeComponent} />
                    <Route exact path="/login" component={LoginComponent} />
                    <Route exact path="/signup" component={SignupComponent} />
                    <Route exact path="/mystories" component={MyStoriesComponent} />
                    <Route exact path="/addstory" component={AddStoryComponent} />
                    {/*<Route exact path="/listPartnerAccount/:id" component={ListPartnerAccount} />*/}
                </Switch>
            </BrowserRouter>
        );
    }
}

答案 5 :(得分:0)

json_encode()

有一些选择 用这种方式

json_encode($data,JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES);