What does tbl = str.maketrans({ord(ch):" " for ch in punctuation}) mean?

Date: 2015-12-04 09:43:40

Tags: python python-3.x

Here is the complete code:

s = 'life is short, stunt it!!?'
from string import punctuation
tbl = str.maketrans({ord(ch):" " for ch in punctuation})
print(s.translate(tbl).split())

I would like to know what tbl = str.maketrans({ord(ch):" " for ch in punctuation}) means in this code, and what it means in general.

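2 answers:

Answer 0: (score: 2)

It builds a dictionary that maps punctuation to spaces, translates the string (effectively removing the punctuation), and then splits on whitespace to produce a list of words.

Step by step... First, build a character translation dictionary in which the keys are the ordinals of the punctuation characters and the replacement character is a space. This uses a dict comprehension to build the dictionary: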

from string import punctuation
s = 'life is short, stunt it!!?'
D = {ord(ch):" " for ch in punctuation}
print(D)

Result:

{64: ' ', 124: ' ', 125: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 123: ' ', 126: ' ', 58: ' ', 59: ' ', 60: ' ', 61: ' ', 62: ' ', 63: ' '}

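This next step is redundant. Although the dictionary prints in a different order, dictionaries are unordered (in the Python version used here, before 3.7), and the keys and values are the same. What maketrans can do is convert character keys to the ordinal values that translate requires, but that was already done when the dictionary was created. maketrans has other use cases as well, but none of them apply here, so the maketrans call is not needed: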

tbl = str.maketrans(D)
print(tbl)
print(D == tbl)

Result:

{64: ' ', 60: ' ', 61: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 59: ' ', 62: ' ', 58: ' ', 123: ' ', 124: ' ', 125: ' ', 126: ' ', 63: ' '}
True
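
For reference, the other ways of calling str.maketrans alluded to above can be sketched roughly as follows. This is a minimal illustration rather than part of the original answer; the variable names char_table, space_table and delete_table are made up for the example.

```python
from string import punctuation

s = 'life is short, stunt it!!?'

# One-argument form with character keys: maketrans converts them to ordinals.
char_table = str.maketrans({',': ' ', '!': ' ', '?': ' '})
print(char_table == {44: ' ', 33: ' ', 63: ' '})  # True

# Two-argument form: two equal-length strings, mapped character by character.
space_table = str.maketrans(punctuation, ' ' * len(punctuation))
print(s.translate(space_table).split())  # ['life', 'is', 'short', 'stunt', 'it']

# Three-argument form: every character in the third string maps to None,
# which deletes it instead of replacing it with a space.
delete_table = str.maketrans('', '', punctuation)
print(s.translate(delete_table))  # life is short stunt it
```

Replacing with a space keeps words separated when punctuation sits between them without whitespace, whereas deleting would join them, so which form fits depends on the input.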

Now do the translation:

s = s.translate(tbl)
print(s)

Result:

life is short  stunt it   

Split into a list of words:

print(s.split())

Result:

['life', 'is', 'short', 'stunt', 'it']

Answer 1: (score: 0)

{ord(ch):" " for ch in punctuation} is a dictionary comprehension. These are similar to (and based on) list comprehensions.
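
To make the parallel concrete, here is a small sketch that puts the two kinds of comprehension side by side. The names codes and table are chosen only for this illustration.

```python
from string import punctuation

# List comprehension: one code point per punctuation character.
codes = [ord(ch) for ch in punctuation]

# Dict comprehension: the same loop, but each item is a key: value pair.
table = {ord(ch): " " for ch in punctuation}

print(len(codes), len(table))          # 32 32
print(sorted(table) == sorted(codes))  # True: the dict keys are the same code points
```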

You can run this code from the Python shell to see what each line does:

>>> from string import punctuation
>>> punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> punctuation_to_spaces = {ord(ch): " " for ch in punctuation}
>>> punctuation_to_spaces
{64: ' ', 124: ' ', 125: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 123: ' ', 126: ' ', 58: ' ', 59: ' ', 60: ' ', 61: ' ', 62: ' ', 63: ' '}
>>> punctuation_removal = str.maketrans(punctuation_to_spaces)
>>> punctuation_removal
{64: ' ', 60: ' ', 61: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 59: ' ', 62: ' ', 58: ' ', 123: ' ', 124: ' ', 125: ' ', 126: ' ', 63: ' '}
>>> s = 'life is short, stunt it!!?'
>>> s.translate(punctuation_removal)
'life is short  stunt it   '

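The dictionary comprehension line basically builds a dictionary with the ASCII values of the punctuation characters as keys and the space character as the value. The .translate call on the string s then uses that dictionary to translate punctuation characters into spaces.

The ord function converts each punctuation character to its numeric code point, which for these characters is the ASCII value.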
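
As a quick sketch of what ord does and why translate wants integer keys, the following can be run as is. It is an illustrative example, not taken from the original answer.

```python
# ord maps a one-character string to its integer code point; chr is the inverse.
print(ord(','))  # 44
print(chr(44))   # ,

s = 'life is short, stunt it!!?'

# translate looks each character up by its code point, so integer keys match.
print(s.translate({ord(','): ' '}))  # life is short  stunt it!!?

# A string key is never matched, so the text comes back unchanged.
print(s.translate({',': ' '}) == s)  # True
```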

Note that using both ord and maketrans is redundant. Either of these solutions works fine and avoids the double translation:

tbl = str.maketrans({ch:" " for ch in punctuation})
print(s.translate(tbl).split())

tbl = {ord(ch):" " for ch in punctuation}
print(s.translate(tbl).split())
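
As a sanity check, the two variants can be compared directly. This is a minimal sketch; tbl_from_chars and tbl_from_ords are names introduced here only to keep both tables around at once.

```python
from string import punctuation

s = 'life is short, stunt it!!?'

# Variant 1: character keys, converted to ordinals by maketrans.
tbl_from_chars = str.maketrans({ch: " " for ch in punctuation})

# Variant 2: ordinal keys built directly, with no maketrans call.
tbl_from_ords = {ord(ch): " " for ch in punctuation}

print(tbl_from_chars == tbl_from_ords)      # True
print(s.translate(tbl_from_chars).split())  # ['life', 'is', 'short', 'stunt', 'it']
```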