MySQL and 4-byte UTF-8 encoding - Incorrect string value

Time: 2014-02-17 20:55:41

Tags: python mysql django utf-8

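According to the MySQL documentation, the utf8 character set only supports UTF-8 encoded characters of up to 3 bytes. My question is: how can I replace characters that require 4-byte UTF-8 encoding in my database? And how do I decode those characters so that I can display exactly what the user wrote?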
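Before replacing anything, it can help to check which characters really need four bytes. The following is a minimal sketch, assuming Python 3 and assuming the test string consists of the code points written out below; the helper names `utf8_width` and `needs_utf8mb4` are made up for illustration.

```python
# Sketch (Python 3): how many UTF-8 bytes does each character need?

def utf8_width(ch):
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text):
    """True if any character lies outside the Basic Multilingual Plane,
    i.e. needs 4 bytes in UTF-8 and therefore MySQL's utf8mb4."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Assumed code points of the test string "baaam á ✓ ✌ ❤"
description = "baaam \u00e1 \u2713 \u270c \u2764"
# An example character outside the BMP (an emoji), for comparison
emoji = "\U0001F600"

for ch in description + emoji:
    if ord(ch) > 0x7F:  # skip plain ASCII
        print(repr(ch), "U+%04X" % ord(ch), utf8_width(ch), "byte(s)")

print(needs_utf8mb4(description))
print(needs_utf8mb4(emoji))
```

Under these assumptions, every character in the test string encodes to at most three bytes. The sequence \xE2\x9C\x93 quoted in the error message is the three-byte encoding of ✓ (U+2713). That suggests the failure may come from a character set mismatch somewhere between client and column, not from a 4-byte character as such.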

Part of the integration test:

description = u'baaam á ✓ ✌ ❤'
print description
test_convention = Blog.objects.create(title="test title",
                                      description=description,
                                      login=self.user,
                                      tag=self.tag)

Error:

Creating test database for alias 'default'...
baaam á ✓ ✌ ❤
E..
======================================================================
ERROR: test_post_blog (blogs.tests.PostTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/admin/Developer/project/pro/blogs/tests.py", line 64, in test_post_blog
    tag=self.tag)
  File "build/bdist.macosx-10.9-intel/egg/MySQLdb/cursors.py", line 201, in execute
    self.errorhandler(self, exc, value)
  File "build/bdist.macosx-10.9-intel/egg/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
DatabaseError: (1366, "Incorrect string value: '\\xE2\\x9C\\x93 \\xE2\\x9C...' for column 'description' at row 1")

----------------------------------------------------------------------
Ran 3 tests in 1.383s

FAILED (errors=1)
Destroying test database for alias 'default'...

Table configuration:

+----------------------------------+--------+---------+-------------------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+----------+----------------+---------+
| Name                             | Engine | Version | Collation         | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Checksum | Create_options | Comment |
+----------------------------------+--------+---------+-------------------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+----------+----------------+---------+
| blogs_blog                       | InnoDB |      10 | utf8_general_ci   | Compact    |   25 |           1966 |       49152 |               0 |        32768 |         0 |             35 | 2014-02-09 00:57:59 | NULL        | NULL       |     NULL |                |         |
+----------------------------------+--------+---------+-------------------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+----------+----------------+---------+

Update: I have changed the table and column configuration from utf8 to utf8mb4 and still get the same error. Any ideas?

+----------------------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+---------+
| Name                             | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation          | Checksum | Create_options | Comment |
+----------------------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+---------+
| blogs_blog                       | InnoDB |      10 | Compact    |    5 |           3276 |       16384 |               0 |        32768 |         0 |             36 | 2014-02-17 22:24:18 | NULL        | NULL       | utf8mb4_general_ci |     NULL |                |         |
+----------------------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+---------+

+---------------+--------------+--------------------+------+-----+---------+----------------+---------------------------------+---------+
| Field         | Type         | Collation          | Null | Key | Default | Extra          | Privileges                      | Comment |
+---------------+--------------+--------------------+------+-----+---------+----------------+---------------------------------+---------+
| id            | int(11)      | NULL               | NO   | PRI | NULL    | auto_increment | select,insert,update,references |         |
| title         | varchar(500) | latin1_swedish_ci  | NO   |     | NULL    |                | select,insert,update,references |         |
| description   | longtext     | utf8mb4_general_ci | YES  |     | NULL    |                | select,insert,update,references |         |
| creation_date | datetime     | NULL               | NO   |     | NULL    |                | select,insert,update,references |         |
| login_id      | int(11)      | NULL               | NO   | MUL | NULL    |                | select,insert,update,references |         |
| tag_id        | int(11)      | NULL               | NO   | MUL | NULL    |                | select,insert,update,references |         |
+---------------+--------------+--------------------+------+-----+---------+----------------+---------------------------------+---------+
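The listings above show the table and column collations, but not the character set the client connection uses. Note also that the title column is still latin1_swedish_ci. As a rough diagnostic sketch, the session variables can be read through any DB-API cursor; the function names here are hypothetical, while `SHOW VARIABLES` and the `character_set_*` variables are standard MySQL.

```python
def charset_variables(cursor):
    """Return MySQL's character_set_* variables as a dict.

    `cursor` is any DB-API cursor connected to a MySQL server.
    """
    cursor.execute("SHOW VARIABLES LIKE 'character_set%'")
    return {name: value for name, value in cursor.fetchall()}


def non_utf8mb4(variables):
    """Names of the client-side variables that are not set to utf8mb4."""
    relevant = ("character_set_client",
                "character_set_connection",
                "character_set_results")
    return [name for name in relevant if variables.get(name) != "utf8mb4"]
```

In a Django shell this could be called as `charset_variables(connection.cursor())` after `from django.db import connection`. If `non_utf8mb4` returns any names, the connection itself is probably not using utf8mb4, regardless of how the table is defined.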

1 Answer:

Answer 0: (score: 0)

It is supported, but not under the name utf8. Add the following to the [mysqld] section of my.cnf:

character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci

When creating a database, use:

CREATE DATABASE xxxxx DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_unicode_ci;

At the end of a CREATE TABLE command, add: [...]
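Because the failing test runs through Django, the connection and the test database likely need utf8mb4 as well. The test runner creates a fresh database ("Creating test database for alias 'default'..."), so changes made to the existing blogs_blog table do not carry over to it. The following is a minimal settings sketch, not part of the original answer: the database name and credentials are placeholders, and the `TEST` dictionary assumes Django 1.7 or later (earlier versions use the `TEST_CHARSET` and `TEST_COLLATION` keys instead).

```python
# settings.py (sketch; NAME, USER, PASSWORD and HOST are placeholders)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "project_db",
        "USER": "project_user",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        # Passed through to the MySQL driver's connect() call, so the
        # client connection uses utf8mb4 instead of the backend's default.
        "OPTIONS": {"charset": "utf8mb4"},
        # Character set and collation for the database the test runner creates.
        "TEST": {
            "CHARSET": "utf8mb4",
            "COLLATION": "utf8mb4_unicode_ci",
        },
    }
}
```

With the server defaults from my.cnf above also in place, new tables in the test database should then be created as utf8mb4 without any per-table options.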