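I am trying to slugify UTF-8 text that contains characters such as æ, ø and å, and I want to keep those characters in the result.
When I use slugify, it does not preserve the non-ASCII characters; it transliterates them to ASCII instead: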
>>> from slugify import slugify
>>> slugify(u'æsel (øen)')
'aesel-oen'
The result should be æsel-øen.
Answer 0 (score: 3)
Use a different library to slugify. The unicode-slugify library outputs exactly what you are asking for:
$ bin/pip install unicode-slugify
Downloading/unpacking unicode-slugify
Downloading unicode-slugify-0.1.1.tar.gz
Running setup.py (path:/.../build/unicode-slugify/setup.py) egg_info for package unicode-slugify
Downloading/unpacking django (from unicode-slugify)
Downloading Django-1.7-py2.py3-none-any.whl (7.4MB): 7.4MB downloaded
Installing collected packages: unicode-slugify, django
Running setup.py install for unicode-slugify
Successfully installed unicode-slugify django
Cleaning up...
$ bin/python
Python 2.7.8 (default, Sep 19 2014, 22:15:41)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from slugify import slugify
>>> slugify(u'æsel (øen)')
u'\xe6sel-\xf8en'
>>> print slugify(u'æsel (øen)')
æsel-øen
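If installing another package (which, as the log above shows, also pulls in Django) is not an option, the same idea can be sketched with the standard library alone. This is only a rough illustration, not the implementation used by unicode-slugify: the helper name `unicode_slug` is hypothetical, the code assumes Python 3 (where `\w` matches Unicode letters in `str` patterns), and its handling of edge cases may differ from the library's.

```python
import re
import unicodedata


def unicode_slug(text):
    """Hypothetical helper: build a slug while keeping non-ASCII letters."""
    # Normalize to a composed form and lowercase, without transliterating.
    text = unicodedata.normalize('NFKC', text).lower()
    # Drop everything that is not a word character, whitespace, or hyphen.
    text = re.sub(r'[^\w\s-]', '', text)
    # Collapse runs of whitespace, underscores, and hyphens into one hyphen,
    # then trim hyphens left over at either end.
    return re.sub(r'[\s_-]+', '-', text).strip('-')


print(unicode_slug('æsel (øen)'))
```

The key difference from the ASCII-only slugify in the question is that nothing here maps æ or ø to ASCII; only punctuation and spacing are touched.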