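I want to compare two strings in a way that ignores differences in special characters. That is,
Hai, this is a test
should match
"Hai ! this is a test" or "Hai this is a test"
Is there a way to do this without modifying the original strings?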
Answer 0 (score: 15)
This removes punctuation and whitespace before doing the comparison:
In [32]: import string
In [33]: def compare(s1, s2):
   ...:     # Python 2 only: str.translate(None, chars) deletes every character in chars
   ...:     remove = string.punctuation + string.whitespace
   ...:     return s1.translate(None, remove) == s2.translate(None, remove)
In [34]: compare('Hai, this is a test', 'Hai ! this is a test')
Out[34]: True
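On Python 3 the two-argument form of translate no longer exists, so the same idea needs a translation table. The following is only a sketch of that port, assuming ASCII punctuation and whitespace are the characters to ignore; the module-level names _REMOVE and _TABLE are made up for illustration.

```python
import string

# Characters to ignore: ASCII punctuation and whitespace only (an assumption).
_REMOVE = string.punctuation + string.whitespace
# Python 3: the third argument of str.maketrans lists characters to delete.
_TABLE = str.maketrans('', '', _REMOVE)


def compare(s1, s2):
    """Return True if s1 and s2 are equal once ignored characters are removed."""
    return s1.translate(_TABLE) == s2.translate(_TABLE)


print(compare('Hai, this is a test', 'Hai ! this is a test'))  # True
```

Since translate returns new strings, s1 and s2 themselves are left untouched.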
Answer 1 (score: 7)
>>> def cmp(a, b):
...     return [c for c in a if c.isalpha()] == [c for c in b if c.isalpha()]
...
>>> cmp('Hai, this is a test', 'Hai ! this is a test')
True
>>> cmp('Hai, this is a test', 'Hai this is a test')
True
>>> cmp('Hai, this is a test', 'other string')
False
This creates two temporary lists, but it does not modify the original strings in any way.
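One thing to be aware of: isalpha() keeps letters only, so digits are ignored along with punctuation. If digits should count, isalnum() may be the better filter. A small sketch of the difference, with hypothetical function names:

```python
def same_letters(a, b):
    """Compare only alphabetic characters; digits are ignored as well."""
    return [c for c in a if c.isalpha()] == [c for c in b if c.isalpha()]


def same_alnum(a, b):
    """Compare letters and digits; punctuation and whitespace are ignored."""
    return [c for c in a if c.isalnum()] == [c for c in b if c.isalnum()]


print(same_letters('test 1', 'test 2'))  # True: the digits are dropped
print(same_alnum('test 1', 'test 2'))    # False: the digits are compared
```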
Answer 2 (score: 1)
To compare any number of strings for alphabetic equivalence:
def samealphabetic(*args):
    # Python 2: filter() on a str returns a str, so each result is hashable
    return len(set(filter(lambda s: s.isalpha(), arg) for arg in args)) <= 1

print samealphabetic('Hai, this is a test',
                     'Hai ! this is a test',
                     'Hai this is a test')
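This prints True. The <= may need to change depending on what you want returned when no arguments are passed: with <= 1 a call with no strings returns True, whereas == 1 would return False.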
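On Python 3, filter() returns an iterator rather than a str, so the set would hold distinct filter objects and the check would fail for any two or more inputs. A possible port, offered as a sketch, joins the filtered characters back into a string first:

```python
def samealphabetic(*args):
    # Join the filtered characters back into a str so each result is hashable
    # and equal strings collapse into a single set element.
    letters = {''.join(filter(str.isalpha, arg)) for arg in args}
    return len(letters) <= 1


print(samealphabetic('Hai, this is a test',
                     'Hai ! this is a test',
                     'Hai this is a test'))  # True
```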
Answer 3 (score: 0)
In general, you strip out the characters you want to ignore and then compare the results:
import re

def equal(a, b):
    # Remove every character that is neither whitespace nor a word character
    regex = re.compile(r'[^\s\w]')
    return regex.sub('', a) == regex.sub('', b)
>>> equal('Hai, this is a test', 'Hai this is a test')
True
>>> equal('Hai, this is a test', 'Hai this@#)($! i@#($()@#s a test!!!')
True
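Note that the pattern above keeps whitespace, so 'Hai ! this is a test' turns into 'Hai  this is a test' with two spaces and would not match. If whitespace should be ignored too, one option is to strip every non-word character. This is a sketch with a hypothetical function name; keep in mind that \w still treats digits and the underscore as word characters.

```python
import re

# \W matches anything that is not a word character, whitespace included.
_NON_WORD = re.compile(r'\W+')


def equal_ignoring_spaces(a, b):
    """Compare a and b after dropping all non-word characters."""
    return _NON_WORD.sub('', a) == _NON_WORD.sub('', b)


print(equal_ignoring_spaces('Hai, this is a test', 'Hai ! this is a test'))  # True
```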
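Answer 4 (score: 0)
Maybe you can first remove the special characters from both strings and then compare them.
In your example, the special characters are ',', '!' and the space.
So for your strings: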
a = 'Hai, this is a test'
b = 'Hai ! this is a test'
# Python 2 only: translate(None, chars) deletes the listed characters
tempa = a.translate(None, ',! ')
tempb = b.translate(None, ',! ')
Then you can compare tempa and tempb.
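For what it's worth, Python strings are immutable, so building tempa and tempb never changes a or b. Here is a rough Python 3 sketch of the same steps, assuming the same three hard-coded characters; anything outside that set, such as '?', would still cause a mismatch.

```python
a = 'Hai, this is a test'
b = 'Hai ! this is a test'

# Python 3: build a deletion table for the three characters named above.
table = str.maketrans('', '', ',! ')
tempa = a.translate(table)
tempb = b.translate(table)

print(tempa == tempb)  # True
print(a)               # Hai, this is a test (the original is unchanged)
```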
Answer 5 (score: 0)
Use the Levenshtein metric to measure the distance between two strings. Rank the string comparisons by score and pick the top n matches.
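To make that concrete, here is a minimal pure-Python sketch of the textbook dynamic-programming edit distance, plus a hypothetical top_matches helper for the ranking step. Be aware that this is approximate matching: it scores every difference, not just special characters, so it answers a looser question than the one asked.

```python
def levenshtein(a, b):
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def top_matches(query, candidates, n):
    """Return the n candidates closest to query (hypothetical helper)."""
    return sorted(candidates, key=lambda c: levenshtein(query, c))[:n]


candidates = ['Hai ! this is a test', 'Hai this is a test', 'other string']
print(top_matches('Hai, this is a test', candidates, 2))
```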
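Answer 6 (score: 0)
Since you mentioned that you don't want to modify the original strings, you can also compare them in place, walking both strings with two indices and allocating no temporary copies.
>>> import string
>>> first = "Hai, this is a test"
>>> second = "Hai ! this is a test"
>>> third = "Hai this is a test"
>>> def my_match(left, right):
    i, j = 0, 0
    ignored = set(string.punctuation + string.whitespace)
    while i < len(left) and j < len(right):
        if left[i] in ignored:
            i += 1
        elif right[j] in ignored:
            j += 1
        elif left[i] != right[j]:
            return False
        else:
            i += 1
            j += 1
    # Skip trailing ignored characters so that "test!" still matches "test"
    while i < len(left) and left[i] in ignored:
        i += 1
    while j < len(right) and right[j] in ignored:
        j += 1
    return i == len(left) and j == len(right)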
>>> my_match(first, second)
True
>>> my_match(first, third)
True
>>> my_match("test", "testing")
False
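If the index bookkeeping feels heavy, roughly the same walk can be sketched with generators, which also avoid building temporary strings or lists. This is just one possible variant on Python 3; the names IGNORED, significant and lazy_match are made up for illustration.

```python
import string
from itertools import zip_longest

IGNORED = frozenset(string.punctuation + string.whitespace)


def significant(s):
    """Lazily yield the characters of s that are not ignored."""
    return (c for c in s if c not in IGNORED)


def lazy_match(left, right):
    """Compare the significant characters pairwise, stopping at the first mismatch."""
    sentinel = object()  # marks the shorter stream running out
    pairs = zip_longest(significant(left), significant(right), fillvalue=sentinel)
    return all(a == b for a, b in pairs)


print(lazy_match("Hai, this is a test", "Hai ! this is a test"))  # True
print(lazy_match("test", "testing"))                              # False
```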