Recently I noticed that some Unicode characters in my database (mostly Japanese) have turned into unintelligible garbage. For example:
「オリジナル」桜庭わかな - 远游荡〜7つめのアレクトラ
became
ã€Œã,ªãƒªã,¸ãƒŠãƒ«ã€æ¡œåºã,ã<㪠- 远wandering~7ã¤ã,ã®ã,¢ãƒ¬ã,¯ãƒãƒ©
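The garbage has the shape that typically appears when UTF-8 bytes are read back as a single-byte Western encoding. The following is a minimal Python sketch of that mechanism, not a diagnosis of this particular server. It assumes the misreading encoding is a cp1252-style latin1; `garble` is a hypothetical helper name, and the handling of the five bytes that cp1252 leaves undefined reflects my understanding of how MySQL's latin1 treats them.

```python
# Bytes that Windows-1252 leaves undefined. Assumption: a cp1252-style
# latin1 reader passes them through as the control characters
# U+0081, U+008D, U+008F, U+0090 and U+009D.
UNDEFINED_IN_CP1252 = {0x81, 0x8D, 0x8F, 0x90, 0x9D}


def garble(text: str) -> str:
    """Encode text as UTF-8, then misread each byte as one latin1 character."""
    chars = []
    for byte in text.encode("utf-8"):
        if byte in UNDEFINED_IN_CP1252:
            # No cp1252 mapping: keep the byte value as a control character.
            chars.append(chr(byte))
        else:
            chars.append(bytes([byte]).decode("cp1252"))
    return "".join(chars)


# Each Japanese character is 3 bytes in UTF-8, so it becomes 3 garbage characters.
print(garble("「オリジナル」"))
```

The opening bracket 「 is the bytes E3 80 8C in UTF-8, which this misreading turns into the three characters ã, € and Œ. Some of the resulting characters are invisible control codes, which may be why pieces appear to be missing when the damaged title is displayed.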
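The data comes from the YouTube API (using PHP) and is then inserted into my database. My first thought was that Google might have broken something for a short while and then fixed it. But by now, because of closed channels and/or deleted videos, I can no longer re-fetch a large part of the data through the YouTube API.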
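Since the originals cannot be re-fetched, one option is to attempt an in-place repair. Here is a minimal Python sketch, assuming the damage is a single round of UTF-8 bytes read as cp1252-style latin1. `latin1_bytes` and `plan_repairs` are hypothetical names, and the rows are assumed to be `(id, title)` pairs already read from the table. It only produces a list of proposed changes and writes nothing back.

```python
from typing import Iterable, Iterator, Tuple


def latin1_bytes(text: str) -> bytes:
    """Map each character back to the single byte a cp1252-style latin1 reader saw."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Covers the control characters cp1252 leaves undefined;
            # raises UnicodeEncodeError for genuine non-Latin text.
            out += ch.encode("latin-1")
    return bytes(out)


def plan_repairs(rows: Iterable[Tuple[int, str]]) -> Iterator[Tuple[int, str]]:
    """Yield (id, repaired_title) only for rows that decode cleanly to something new."""
    for row_id, title in rows:
        try:
            fixed = latin1_bytes(title).decode("utf-8")  # strict decoding
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # intact text, legitimate accents, or bytes lost: leave alone
        if fixed != title:
            yield row_id, fixed
```

Strict decoding is deliberate here. Intact Japanese titles, plain ASCII, and legitimately accented Latin text are all skipped, and a damaged row that has lost bytes along the way fails to decode and is left for manual review rather than being silently altered. It is worth running this against a backup copy and checking the proposed changes before issuing any UPDATE.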
What I have already tried:
I have been told that the MySQL on my server uses XtraDB rather than InnoDB. Here is the information about my database and table:
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
mysql> SHOW FULL COLUMNS FROM my_table;
+--------------+---------------+-----------------+------+-----+-------------------+----------------+---------------------------------+---------+
| Field        | Type          | Collation       | Null | Key | Default           | Extra          | Privileges                      | Comment |
+--------------+---------------+-----------------+------+-----+-------------------+----------------+---------------------------------+---------+
| id           | int(11)       | NULL            | NO   | PRI | NULL              | auto_increment | select,insert,update,references |         |
| timestamp    | timestamp     | NULL            | NO   |     | CURRENT_TIMESTAMP |                | select,insert,update,references |         |
| etag         | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| video_id     | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| published    | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| title        | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| description  | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| thumb        | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| channel_id   | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| channel_name | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| category     | decimal(10,0) | NULL            | YES  |     | NULL              |                | select,insert,update,references |         |
| duration     | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
| definition   | text          | utf8_general_ci | YES  |     | NULL              |                | select,insert,update,references |         |
+--------------+---------------+-----------------+------+-----+-------------------+----------------+---------------------------------+---------+
13 rows in set (0.00 sec)