Question

I'm trying to replace some accented letters in Portuguese words using Python.

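The idea is that each character in accentedLetters is replaced by the letter at the same position in the letters array.

For example, the mapping I expect is: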

accentedLetters = ['à', 'á', 'â', 'ã', 'é', 'ê', 'í', 'ó', 'ô', 'õ', 'ú', 'ü']
letters         = ['a', 'a', 'a', 'a', 'e', 'e', 'i', 'o', 'o', 'o', 'u', 'u']

How can I do this?

Answer 1

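A simple translation dictionary should do the trick. For each letter, if the letter is in the dictionary, use its translation; otherwise, keep the original. Then join the individual characters back into a word.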

def removeAccents(word):
    repl = {'à': 'a', 'á': 'a', 'â': 'a', 'ã': 'a',
            'é': 'e', 'ê': 'e',
            'í': 'i',
            'ó': 'o', 'ô': 'o', 'õ': 'o',
            'ú': 'u', 'ü': 'u'}

    new_word = ''.join([repl[c] if c in repl else c for c in word])
    return new_word
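
The same per-character lookup can also be handed to Python 3's built-in `str.maketrans` and `str.translate`. The sketch below builds the table directly from the two lists in the question; the helper name `remove_accents_translate` is made up for illustration, and it assumes the input uses precomposed characters like the literals in the lists.

```python
# Python 3 sketch: build a translation table from the question's two lists.
accented_letters = ['à', 'á', 'â', 'ã', 'é', 'ê', 'í', 'ó', 'ô', 'õ', 'ú', 'ü']
letters          = ['a', 'a', 'a', 'a', 'e', 'e', 'i', 'o', 'o', 'o', 'u', 'u']

# str.maketrans takes two equal-length strings and maps position i to position i.
ACCENT_TABLE = str.maketrans(''.join(accented_letters), ''.join(letters))


def remove_accents_translate(word):
    """Hypothetical helper: replace each listed accented letter, keep the rest."""
    return word.translate(ACCENT_TABLE)


# 'ç' is not in the lists, so it is left untouched.
print(remove_accents_translate('informações'))  # informaçoes
```

Characters that are missing from the lists pass through unchanged, so extending the mapping only means appending to both lists in step.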

Answer 2

You could take a look at the Unidecode library for Python 3.

For example:

from unidecode import unidecode

a = ['à', 'á', 'â', 'ã', 'é', 'ê', 'í', 'ó', 'ô', 'õ', 'ú', 'ü']

for k in a:
    print(unidecode(k), end=' ')

Result:

a a a a e e i o o o u u
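
If installing a third-party package is not an option, a similar effect for Latin letters can be approximated with the standard library's `unicodedata` module. Note that this is a different technique from Unidecode: it decomposes each character and then drops the combining marks. The function name `strip_accents` is an assumption for illustration.

```python
import unicodedata


def strip_accents(text):
    """Sketch: decompose characters (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize('NFD', text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


a = ['à', 'á', 'â', 'ã', 'é', 'ê', 'í', 'ó', 'ô', 'õ', 'ú', 'ü']
print(' '.join(strip_accents(k) for k in a))  # a a a a e e i o o o u u
```

Unlike the fixed lists in the question, this approach also turns `ç` into `c`, which may or may not be what you want.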

Answer 3

I finally solved my problem (Python 2):

#! /usr/bin/python
# -*- coding: utf-8 -*-

import sys

def removeAccents(word):
    replaceDict = {'à'.decode('utf-8'): 'a', 
                   'á'.decode('utf-8'): 'a',
                   'â'.decode('utf-8'): 'a',
                   'ã'.decode('utf-8'): 'a',
                   'é'.decode('utf-8'): 'e',
                   'ê'.decode('utf-8'): 'e',
                   'í'.decode('utf-8'): 'i',
                   'ó'.decode('utf-8'): 'o',
                   'ô'.decode('utf-8'): 'o',
                   'õ'.decode('utf-8'): 'o',
                   'ú'.decode('utf-8'): 'u',
                   'ü'.decode('utf-8'): 'u'}

    finalWord = ''
    for letter in word:
        if letter in replaceDict:
            finalWord += replaceDict[letter]
        else:
            finalWord += letter
    return finalWord


word = (sys.argv[1]).decode('utf-8')
print removeAccents(word)

This works just as I expected.
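
The script above targets Python 2, where byte strings have to be decoded by hand. Under Python 3, `str` is already Unicode and `sys.argv` arrives decoded, so a rough equivalent might look like the following. This is a sketch of a port rather than the author's code; the module-level name `REPLACE_DICT` and the loop over all command-line arguments are additions.

```python
#!/usr/bin/env python3
import sys

# Same mapping as the Python 2 script; no .decode('utf-8') is needed here.
REPLACE_DICT = {
    'à': 'a', 'á': 'a', 'â': 'a', 'ã': 'a',
    'é': 'e', 'ê': 'e',
    'í': 'i',
    'ó': 'o', 'ô': 'o', 'õ': 'o',
    'ú': 'u', 'ü': 'u',
}


def removeAccents(word):
    # dict.get(letter, letter) falls back to the original letter
    # when there is no replacement for it.
    return ''.join(REPLACE_DICT.get(letter, letter) for letter in word)


# Print the unaccented form of every word given on the command line.
for word in sys.argv[1:]:
    print(removeAccents(word))
```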

Answer 4

Another simple option, using regular expressions:

import re

def remove_accents(string):
    if type(string) is not unicode:
        string = unicode(string, encoding='utf-8')

    string = re.sub(u"[àáâãäå]", 'a', string)
    string = re.sub(u"[èéêë]", 'e', string)
    string = re.sub(u"[ìíîï]", 'i', string)
    string = re.sub(u"[òóôõö]", 'o', string)
    string = re.sub(u"[ùúûü]", 'u', string)
    string = re.sub(u"[ýÿ]", 'y', string)

    return string
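
On Python 3 every `str` is already Unicode, so the type check is unnecessary. The six substitutions can also be folded into a single pass by giving `re.sub` a replacement function. The sketch below is one way to do that; the names `_GROUPS`, `_LOOKUP` and `_PATTERN` are invented for illustration, and like the code above it only handles lowercase letters.

```python
import re

# Base letter -> accented variants, taken from the character classes above.
_GROUPS = {
    'a': 'àáâãäå',
    'e': 'èéêë',
    'i': 'ìíîï',
    'o': 'òóôõö',
    'u': 'ùúûü',
    'y': 'ýÿ',
}

# Invert the groups into a per-character lookup table.
_LOOKUP = {accented: base
           for base, chars in _GROUPS.items()
           for accented in chars}

# One character class that matches any accented letter listed above.
_PATTERN = re.compile('[' + ''.join(_LOOKUP) + ']')


def remove_accents(string):
    # re.sub calls the function once per match and inserts its return value.
    return _PATTERN.sub(lambda match: _LOOKUP[match.group(0)], string)


# 'ç' is not in any group, so it is left as-is.
print(remove_accents('coração'))  # coraçao
```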

Replacing some accented letters in words with Python