I named the file, for example, Glacière_Service-de-lEducation-Ambassade-Chine_map.png.
The full path should be http://example.com/.../Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png (è = e + %CC%80).
However, the path is interpreted as http://example.com/.../Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png, so the image is not displayed after the post is published (è = %C3%A8).
Why does è have two different encodings?
Answer 0 (score: 1)
Note the difference:
     ↓
Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png
Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png
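Read about Normalization Forms in Unicode® Standard Annex #15: Unicode Normalization Forms. In short, è can be written either as one precomposed code point (U+00E8, %C3%A8 in UTF-8) or as a plain e followed by a combining grave accent (U+0300, %CC%80 in UTF-8).
Unfortunately, I don't speak PHP; however, the following Python example may help: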
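As a quick check of that claim, the two percent-encoded fragments from the question can be decoded and compared. This is only a minimal sketch using the Python standard library; the variable names are mine, and the fragments are shortened to the part of the file name that differs.

```python
import unicodedata
from urllib.parse import unquote

# Decode both percent-encoded spellings as UTF-8
decomposed = unquote("Glacie%CC%80re")  # 'e' followed by U+0300
composed = unquote("Glaci%C3%A8re")     # single code point U+00E8

# The strings differ code point by code point ...
print(len(decomposed), len(composed))
print(decomposed == composed)

# ... but become identical once both are in the same normalization form
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)
```

So both URLs spell the same visible name; they only differ in normalization form, which a byte-for-byte path comparison does not forgive.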
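import unicodedata,urllib
from urllib import parse
# Precomposed character: a single code point, U+00E8
x = unicodedata.lookup('Latin Small Letter E With Grave')
print(x, len(x))
# Decomposed form: 'e' followed by U+0300 COMBINING GRAVE ACCENT
y = unicodedata.normalize('NFKD', x)
print(y, len(y))
# Print each code point, its percent-encoding and its Unicode name
for char in (x + ' ' + y):
    print(char, urllib.parse.quote(char, safe='/'), unicodedata.name(char, '?'))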
Result:
==> python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata,urllib
>>> from urllib import parse
>>>
>>> x = unicodedata.lookup('Latin Small Letter E With Grave')
>>> print(x, len(x))
è 1
>>>
>>> y = unicodedata.normalize( 'NFKD', x)
>>> print(y, len(y))
è 2
>>>
>>> for char in (x + ' ' + y):
...     print(char, urllib.parse.quote(char, safe='/'),unicodedata.name(char, '?'))
...
è %C3%A8 LATIN SMALL LETTER E WITH GRAVE
%20 SPACE
e e LATIN SMALL LETTER E
̀ %CC%80 COMBINING GRAVE ACCENT
>>>
>>>
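A screenshot of the result was added because I could not prevent the `è 2` string in the code example above from being NFKC-normalized; see the result of print(y, len(y)).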
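If the goal is to make the generated URL match the name stored on disk, one possible approach is to normalize the file name to a single, known form before percent-encoding it. The sketch below is in Python rather than PHP, `quote_normalized` is a hypothetical helper name of mine, and which form the server actually stores is an assumption you would have to verify.

```python
import unicodedata
from urllib.parse import quote

def quote_normalized(name, form="NFC"):
    """Percent-encode `name` after normalizing it to the given Unicode form."""
    return quote(unicodedata.normalize(form, name), safe="/")

# File name written with the decomposed è (e + U+0300), as in the question
filename = "Glacie\u0300re_Service-de-lEducation-Ambassade-Chine_map.png"

print(quote_normalized(filename, "NFC"))  # composed form, contains %C3%A8
print(quote_normalized(filename, "NFD"))  # decomposed form, contains e%CC%80
```

Whichever form you pick, use the same one when the file is saved and when the link is built, so both sides agree on the bytes.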