Converting escaped UTF-16 (JSON entities) back to normal UTF-8 inside MySQL

Time: 2012-04-07 18:00:49

Tags: mysql utf-8 utf-16

How can I convert data like this

\u0441\u043e\u0432\u0440\u0435\u043c\u0435\u043d

back to normal UTF-8 inside MySQL? Is it even possible?

1 answer:

Answer 0 (score: 0)

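You can convert between character sets in MySQL by reading the data as binary and then converting it to the target character set, like this (assuming you have a table named example and a column named data):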

UPDATE `example` SET data=CONVERT(CONVERT(`data` USING binary) USING utf8);
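
Before running that UPDATE, it may help to preview what the re-interpretation produces. The following is a minimal sketch against the same assumed `example` table and `data` column; the `stored_bytes` and `as_utf8` aliases are only illustrative names. Note that this step only re-labels the stored bytes as utf8, which helps when the declared column character set does not match the bytes; it does not decode \uXXXX escapes, because those are plain ASCII characters.

```sql
-- Preview the re-interpretation before changing anything.
-- HEX() shows the stored bytes that get re-labelled as utf8.
SELECT `data`,
       HEX(`data`) AS stored_bytes,
       CONVERT(CONVERT(`data` USING binary) USING utf8) AS as_utf8
FROM `example`
LIMIT 10;
```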

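JSON \uXXXX entities are hex-encoded UTF-16 code units, so if your MySQL version supports utf16, you can convert them to UTF-8. The following function shows how. First you UNHEX() the values, then you convert from UTF-16 to UTF-8: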

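DELIMITER @@
-- Decode JSON-style \uXXXX escapes (hex UTF-16 code units) into UTF-8 text.
CREATE FUNCTION Unjson (instring TEXT CHARACTER SET utf8)
RETURNS TEXT CHARACTER SET utf8
-- Same input always gives the same output; binary logging needs this declared.
DETERMINISTIC
BEGIN

  DECLARE i INT DEFAULT 0;
  -- Set the charset explicitly; otherwise the database default is used.
  DECLARE c VARCHAR(1) CHARACTER SET utf8;
  -- Pending run of decoded UTF-16 code units, not yet converted to UTF-8.
  DECLARE utfstr TEXT CHARACTER SET utf16 DEFAULT '';
  DECLARE outstring TEXT CHARACTER SET utf8 DEFAULT '';

  WHILE i < CHAR_LENGTH(instring) DO
    SET i = i + 1;
    SET c = SUBSTRING(instring, i, 1);
    IF c = '\\' THEN
      SET c = SUBSTRING(instring, i + 1, 1);
      IF c = 'u' THEN
        -- Append the two raw bytes of this code unit and skip past "uXXXX".
        SET utfstr = CONCAT(utfstr, UNHEX(SUBSTRING(instring, i + 2, 4)));
        SET i = i + 5;
      END IF;
      -- Note: a backslash that is not followed by "u" is dropped.
    ELSE
      -- Ordinary character: flush any pending UTF-16 run, then copy it.
      IF utfstr != '' THEN
        SET outstring = CONCAT(outstring, CONVERT(utfstr USING utf8));
        SET utfstr = '';
      END IF;
      SET outstring = CONCAT(outstring, c);
    END IF;
  END WHILE;
  -- Flush a run that reaches the end of the input.
  IF utfstr != '' THEN
    SET outstring = CONCAT(outstring, CONVERT(utfstr USING utf8));
  END IF;

  RETURN outstring;
END@@
DELIMITER ;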
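
A quick way to sanity-check the function is to call it on the sample from the question. This is a minimal sketch that assumes the default sql_mode (without NO_BACKSLASH_ESCAPES), where each backslash inside a string literal has to be doubled. The second statement shows the single decoding step that the function repeats for every escape; the `decoded` and `one_char` aliases are only illustrative.

```sql
-- Backslashes are doubled because '\\' is one backslash in a MySQL literal.
SELECT Unjson('\\u0441\\u043e\\u0432\\u0440\\u0435\\u043c\\u0435\\u043d') AS decoded;

-- One escape by hand: hex digits -> raw bytes -> UTF-16 -> UTF-8.
SELECT CONVERT(CONVERT(UNHEX('0441') USING utf16) USING utf8) AS one_char;
```

The first query should return the Cyrillic text современ. Characters outside the Basic Multilingual Plane arrive as surrogate pairs, and MySQL's 3-byte utf8 cannot hold them, so they would probably need utf8mb4 instead; treat that part as an untested assumption.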

With this MySQL function in place, you can convert a table with something like:

UPDATE `table_name` SET `column_name`=Unjson(`column_name`);
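
Since that UPDATE rewrites every row, a dry run restricted to rows that actually contain an escape may be safer. This is a sketch under the same placeholder names (`table_name`, `column_name`); the `before_value` and `after_value` aliases are hypothetical, and the doubled backslash again assumes the default sql_mode.

```sql
-- Dry run: show only rows that contain a literal \u sequence.
SELECT `column_name` AS before_value,
       Unjson(`column_name`) AS after_value
FROM `table_name`
WHERE INSTR(`column_name`, '\\u') > 0
LIMIT 10;

-- Once the preview looks right, update only those rows.
UPDATE `table_name`
SET `column_name` = Unjson(`column_name`)
WHERE INSTR(`column_name`, '\\u') > 0;
```

INSTR is used here instead of LIKE because a literal backslash in a LIKE pattern needs an extra level of escaping.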

I used MySQL 5.5, and I don't think 5.0 supports utf-16, so you may need to check your MySQL version...
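
One way to check this on a given server, sketched with two standard statements:

```sql
-- Report the server version.
SELECT VERSION();

-- List the utf16 character set if the server provides it;
-- an empty result means utf16 is not available.
SHOW CHARACTER SET LIKE 'utf16';
```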

(And yes, I suggest you make a backup before running this in production...) ;)
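
As a rough sketch of a quick in-database copy, assuming the placeholder table name from above and a hypothetical backup name `table_name_backup` (a full dump with mysqldump is the more complete option):

```sql
-- Create an empty copy with the same columns and indexes.
CREATE TABLE `table_name_backup` LIKE `table_name`;

-- Copy all rows into the backup table.
INSERT INTO `table_name_backup` SELECT * FROM `table_name`;
```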