Some confusing behavior when inserting emoji into a MySQL table

Date: 2015-05-04 06:03:56

Tags: mysql emoji

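While inserting emoji characters from the mysql interactive client, I ran into some very confusing behavior. I hope someone can clear it up. Take a look at the following: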

mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name            | Value                                 |
+--------------------------+---------------------------------------+
| character_set_client     | utf8                                  |
| character_set_connection | utf8                                  |
| character_set_database   | latin1                                |
| character_set_filesystem | binary                                |
| character_set_results    | utf8                                  |
| character_set_server     | latin1                                |
| character_set_system     | utf8                                  |
| character_sets_dir       | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
CREATE TABLE `t` (
`data` varchar(100) CHARACTER SET utf8mb4 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> insert into t select '\U+1F600';
ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x80' for column 'data' at row 1
mysql> set names utf8mb4;
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.00 sec)
mysql> select * from t;
+------+
| data |
+------+
|      |
+------+
mysql> select data, hex(data) from t;
+------+-----------+
| data | hex(data) |
+------+-----------+
|      | F09F9880  |
+------+-----------+

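Why do I need to run SET NAMES utf8mb4 explicitly? Judging from the error message, the data seems to have been parsed correctly as four bytes (F0 9F 98 80), so why does the insert still fail?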

Here is my other puzzle.

mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name            | Value                                 |
+--------------------------+---------------------------------------+
| character_set_client     | latin1                                |
| character_set_connection | latin1                                |
| character_set_database   | latin1                                |
| character_set_filesystem | binary                                |
| character_set_results    | latin1                                |
| character_set_server     | latin1                                |
| character_set_system     | utf8                                  |
| character_sets_dir       | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.01 sec)
mysql> select data,hex(data) from t;
+------+--------------------+
| data | hex(data)          |
+------+--------------------+
|      | C3B0C5B8CB9CE282AC |
+------+--------------------+

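I have to say I was a bit shocked by this. As I understood it, only utf8mb4 supports emoji characters, yet here latin1 appears to accept them too. Can anyone clear this up for me? Thanks!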

1 answer:

Answer 0 (score: 0)

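You can insert UTF-8 data into a latin1 table, but MySQL will not treat the byte stream as UTF-8 characters, so you cannot, for example, query on it meaningfully. If your application understands the UTF-8 byte stream, it will look as though everything works. But for MySQL itself to understand those bytes as Unicode characters, the table's character set really does need to be utf8 (or utf8mb4).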
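
The two hex values in the question can be reproduced outside MySQL, which may help show what happened to the bytes. The following is a minimal Python sketch, not a description of MySQL internals. It assumes that Python's cp1252 codec is a fair stand-in for MySQL's latin1 (the two agree on the four bytes involved here), and the helper names utf8_hex and misread_as_latin1 are made up for illustration. In the first experiment the client character set was MySQL's utf8, which stores at most three bytes per character, so the four-byte sequence F0 9F 98 80 is most likely rejected for that reason until SET NAMES utf8mb4 is issued. In the second experiment the connection was latin1, so each of the four bytes appears to have been read as a separate latin1 character and then converted to utf8mb4 for the column.

```python
# The character the question inserts: U+1F600.
EMOJI = "\U0001F600"


def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as uppercase hex, similar to MySQL's HEX()."""
    return text.encode("utf-8").hex().upper()


def misread_as_latin1(text: str) -> str:
    """Simulate UTF-8 bytes being interpreted one byte at a time as latin1.

    Assumption: cp1252 stands in for MySQL's latin1 character set.
    """
    raw = text.encode("utf-8")   # the bytes the terminal actually sends
    return raw.decode("cp1252")  # each byte becomes one latin1 character


# The emoji needs four bytes in UTF-8, one more than MySQL's utf8 allows.
print(len(EMOJI.encode("utf-8")))

# Stored correctly as utf8mb4 (first experiment, after SET NAMES utf8mb4).
print(utf8_hex(EMOJI))

# Four latin1 characters re-encoded into the utf8mb4 column (second experiment).
print(utf8_hex(misread_as_latin1(EMOJI)))
```

Under these assumptions the script prints 4, then F09F9880, then C3B0C5B8CB9CE282AC, matching the two HEX(data) results in the question. In other words, the latin1 session did not store an emoji: it stored the four characters ð, Ÿ, ˜ and €, which only look like an emoji again when a client sends the bytes back over a latin1 connection to a UTF-8 terminal.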