Question

Hi, I've been trying this for a while now without any result. I have dict = {'Å':'a', 'Ä':'a', 'Ö':'0', 'å':'a', 'ä':'a', 'ö':'o'}

input = lxml.etree.parse(inputxml)
for block in input.xpath('//PAGE/BLOCK/TEXT'):
    J = block.xpath('TOKEN/text()')
    current = 0
    line = ""
    while current < len(J):
        A = J[current]
        current += 1

I need to scan A using dict, find the non-English letters, and replace them with English letters:

for i in A:
    if dict.has_key(i):
        ReplaceWord = A.replace(i, dict[i])

But this doesn't work.

Answer 1

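Not what you asked, but it looks like you might be interested in this: Unidecode is a module designed specifically to reduce any sequence of characters to the most similar ASCII characters.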

>>> import unidecode # to install: `pip install unidecode`
>>> line = u"Flyttbara hyllplan anpassar förvaringen så"
>>> unidecode.unidecode(line)
u'Flyttbara hyllplan anpassar forvaringen sa'
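
If installing a third-party package is not an option, a rough standard-library approximation is to decompose each character with `unicodedata.normalize` and drop the combining marks. This is a different technique from Unidecode, not a reimplementation of it; the helper name `strip_accents` is made up for this sketch, and Python 3 is assumed.

```python
import unicodedata

def strip_accents(text):
    # NFKD splits e.g. 'ö' into 'o' followed by a combining diaeresis
    decomposed = unicodedata.normalize('NFKD', text)
    # Keep only the characters that are not combining marks
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

line = "Flyttbara hyllplan anpassar förvaringen så"
stripped = strip_accents(line)
print(stripped)
```

Unlike the dict in the question, this keeps the case of the letter ('Å' becomes 'A'), and characters that have no decomposition (such as 'ø') pass through unchanged.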

Answer 2

str.translate is what you need.

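# Python 3: map each character in the first string to the one at the same position in the second
d = str.maketrans('ÅÄÖåäö', 'aaoaao')
# s is the string to convert; translate returns a new string
s.translate(d)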
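
A complete run of the snippet above might look like the following sketch, assuming Python 3 (where `str.maketrans` is available). The sample sentence is borrowed from the other answers, and the names `table` and `result` are only illustrative.

```python
# Build the table once; both strings must have the same length
table = str.maketrans('ÅÄÖåäö', 'aaoaao')

line = "Flyttbara hyllplan anpassar förvaringen så"
# Strings are immutable, so translate returns a new string
result = line.translate(table)
print(result)

# maketrans also accepts a dict whose keys are single characters
table_from_dict = str.maketrans({'å': 'a', 'ä': 'a', 'ö': 'o'})
same_result = line.translate(table_from_dict)
```

Building the table once and reusing it for every token avoids looping over the characters by hand.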

Answer 3

This works in both Python 3 and 2.x:

letters = {'Å':'a', 'Ä':'a', 'Ö':'0', 'å':'a', 'ä':'a', 'ö':'o'}
line = "Flyttbara hyllplan anpassar förvaringen så"
for c in letters:
    line = line.replace(c, letters[c])
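
The loop works because each `replace` result is assigned back to `line`. In the question's code every call starts again from the unchanged `A`, so at most the last replacement survives in `ReplaceWord`. A minimal sketch of the same idea applied to a list of tokens, like the one the question gets from `xpath`, is shown here. The helper name `replace_letters` and the sample tokens are invented for illustration, and 'Ö' is mapped to 'o' on the assumption that the '0' in the question is a typo.

```python
letters = {'Å': 'a', 'Ä': 'a', 'Ö': 'o', 'å': 'a', 'ä': 'a', 'ö': 'o'}

def replace_letters(text, mapping):
    # Reassign on every step so each replacement builds on the previous one
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text

tokens = ["Flyttbara", "förvaringen", "så"]
cleaned = [replace_letters(token, letters) for token in tokens]
print(cleaned)
```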
