I came across a mind-boggling problem with mysql encoding today and would appreciate ideas on how to debug it further.
I had to update an old perl application, using mysql 5.6, which was originally English-only and to which I had to add some unicode support (for khmer script).
I figured it would be best to do a test install. I took a dump of the prod db, imported it into a test db, and changed the charset of the tables that needed support to utf8 collate utf8_unicode_ci.
All worked well, so I went to apply it to production. I ran the sql migration scripts to change charsets, deployed the new code and ... khmer characters do store/show fine, but legacy è characters show as a question mark with a black square.
What really puzzles me is that:
test and prod run on the same (windows) box, same mysql server instance
both test and prod databases have the same charsets and collation
for the table in question, test and prod show create table statements are identical
the same code connected to test works fine but connected to prod doesn't
I thought maybe the original data got mangled in the process, so I deleted it and reinserted it through the app interface. It still worked on test but not on prod.
The same code works on test, so code is probably not the issue. Both are on the same server instance, so it is probably not a server config issue. Khmer script works fine, so it is probably not a utf "configuration" issue. New data is wrongly handled, so it is probably not a data migration/conversion issue.
So 2 questions:
is the question mark with black square a sign of double encoding or just wrong encoding?
how can I debug this further? Is there any way to see the "raw" mysql stored data, for example, so I could compare?
Any input greatly appreciated.
Answer 0 (score: 0)
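When trying to use utf8/utf8mb4, if you see black diamonds with question marks, one of these cases applies:
Case 1 (the original bytes were not utf8):
The connection (or SET NAMES) for the INSERT and the SELECT was not utf8/utf8mb4. Fix this.
Also check that the column is declared CHARACTER SET utf8 (or utf8mb4).
Case 2 (the original bytes were utf8):
The connection (or SET NAMES) for the SELECT was not utf8/utf8mb4. Fix this.
Also check that the column is declared CHARACTER SET utf8 (or utf8mb4).
Black diamonds appear only when the browser is set to <meta charset=UTF-8>.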
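To make the two cases concrete, here is a minimal Python sketch of what probably happens to a legacy è on its way to the browser. It assumes the legacy data is latin1 (where è is the single byte E8) and that the page is rendered as UTF-8; the helper name render_as_utf8 is made up for illustration and is not part of MySQL or Perl.

```python
# U+FFFD is the Unicode replacement character; browsers draw it as a
# black diamond with a question mark.
REPLACEMENT = "\ufffd"


def render_as_utf8(raw: bytes) -> str:
    """Decode bytes the way a UTF-8 page would, replacing invalid bytes."""
    return raw.decode("utf-8", errors="replace")


latin1_bytes = "è".encode("latin-1")  # b'\xe8', assumed legacy encoding
utf8_bytes = "è".encode("utf-8")      # b'\xc3\xa8'

# A lone E8 byte is not valid UTF-8, so it turns into the replacement character.
print(render_as_utf8(latin1_bytes) == REPLACEMENT)

# Proper UTF-8 bytes come through unchanged.
print(render_as_utf8(utf8_bytes) == "è")
```

If this matches what you see, the black diamond points to bytes that are not UTF-8 reaching a UTF-8 page (wrong encoding), rather than to double encoding.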
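Unrelated, but since you brought it up:
When trying to use utf8/utf8mb4, if you see Mojibake, check the following. This discussion also applies to double encoding, which is not necessarily visible.
The connection used when INSERTing and SELECTing text needs to specify utf8 or utf8mb4.
The column needs to be declared CHARACTER SET utf8 (or utf8mb4).
HTML should start with <meta charset=UTF-8>.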
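As for seeing the "raw" stored data: MySQL's HEX() function returns the stored bytes of a value, so something like SELECT HEX(col) on the same row in test and prod (table and column names are yours to fill in) lets you compare them. The sketch below is only an illustration of how to read that output for a single è; the function name classify_e_grave is hypothetical, and it assumes the legacy encoding was latin1.

```python
def classify_e_grave(hex_value: str) -> str:
    """Guess how a single stored 'è' was encoded from its HEX() output."""
    raw = bytes.fromhex(hex_value)
    utf8 = "è".encode("utf-8")        # C3 A8
    latin1 = "è".encode("latin-1")    # E8
    # Double encoding: UTF-8 bytes misread as latin1, then encoded as UTF-8 again.
    double = utf8.decode("latin-1").encode("utf-8")  # C3 83 C2 A8
    if raw == utf8:
        return "utf8"
    if raw == latin1:
        return "latin1"
    if raw == double:
        return "double-encoded"
    return "unknown"


for sample in ("C3A8", "E8", "C383C2A8"):
    print(sample, classify_e_grave(sample))
```

If test and prod return different hex for the same text, the difference is in what gets stored; if they return the same hex, the difference is more likely in the connection settings used when reading.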