Question

I named a file, for example, Glacière_Service-de-lEducation-Ambassade-Chine_map.png.

The full path should be http://example.com/.../Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png (è = e%CC%80).

However, the path is interpreted as http://example.com/.../Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png, so the image is not displayed after the post is published (è = %C3%A8).

Why is è encoded in two different ways?

Answer 1

Note the difference:

     ↓
Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png
Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png

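The first name uses the decomposed form of è (the letter e followed by U+0300 COMBINING GRAVE ACCENT), while the second uses the precomposed form (the single code point U+00E8). Read the Normalization Forms section of Unicode® Standard Annex #15: Unicode Normalization Forms.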
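As a quick check that the two names above differ only in normalization form, they can be decoded and compared. This is a minimal sketch using only the Python standard library; the variable names `decomposed` and `precomposed` are mine, not part of the question.

```python
import unicodedata
from urllib.parse import unquote

# The two percent-encoded file names from the question, decoded as UTF-8.
decomposed = unquote("Glacie%CC%80re_Service-de-lEducation-Ambassade-Chine_map.png")
precomposed = unquote("Glaci%C3%A8re_Service-de-lEducation-Ambassade-Chine_map.png")

# As raw code point sequences the names are different strings,
# and the decomposed one is one code point longer.
print(decomposed == precomposed)
print(len(decomposed) - len(precomposed))

# Once both are brought to the same normalization form, they compare equal.
nfc_equal = unicodedata.normalize("NFC", decomposed) == unicodedata.normalize("NFC", precomposed)
nfd_equal = unicodedata.normalize("NFD", decomposed) == unicodedata.normalize("NFD", precomposed)
print(nfc_equal, nfd_equal)
```

A byte-wise comparison of the two URLs therefore fails even though both render as the same visible name.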

Unfortunately, I don't speak PHP; however, the following Python example may help:

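import unicodedata, urllib
from urllib import parse  # makes urllib.parse available

# Precomposed character: the single code point U+00E8
x = unicodedata.lookup('Latin Small Letter E With Grave')
print(x, len(x))

# Decomposed form: 'e' followed by U+0300 COMBINING GRAVE ACCENT
y = unicodedata.normalize('NFKD', x)
print(y, len(y))

# Percent-encode each code point (as UTF-8) and show its Unicode name
for char in (x + ' ' + y):
    print(char, urllib.parse.quote(char, safe='/'), unicodedata.name(char, '?'))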

Result:

==> python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata,urllib
>>> from urllib import parse
>>>
>>> x = unicodedata.lookup('Latin Small Letter E With Grave')
>>> print(x, len(x))
è 1
>>>
>>> y = unicodedata.normalize( 'NFKD', x)
>>> print(y, len(y))
è 2
>>>
>>> for char in (x + ' ' + y):
...   print(char, urllib.parse.quote(char, safe='/'),unicodedata.name(char, '?'))
...
è %C3%A8 LATIN SMALL LETTER E WITH GRAVE
  %20 SPACE
e e LATIN SMALL LETTER E
̀ %CC%80 COMBINING GRAVE ACCENT
>>>
>>>

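A screenshot of the result was added because I could not prevent NFKC normalization of the `è 2` line in the code example above; see the result of print(y, len(y)).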
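One way to avoid the mismatch is to normalize the file name to a single form before percent-encoding it. The sketch below is only an illustration: `encode_filename` is a hypothetical helper, the file name is a shortened stand-in for the one in the question, and which form a given server or file system expects is not stated in the question, so the form is left as a parameter.

```python
import unicodedata
from urllib.parse import quote

def encode_filename(name, form="NFC"):
    """Normalize `name` to the given Unicode form, then percent-encode it as UTF-8.

    Hypothetical helper for illustration; not part of WordPress or any library.
    """
    return quote(unicodedata.normalize(form, name), safe="")

# Shortened, hypothetical file name containing the precomposed è (U+00E8).
name = "Glaci\u00e8re_map.png"

print(encode_filename(name, "NFC"))  # Glaci%C3%A8re_map.png
print(encode_filename(name, "NFD"))  # Glacie%CC%80re_map.png
```

If the stored file name and the generated URL are both passed through the same normalization form, the two encodings can no longer diverge.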

WordPress: special characters in file names
