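Delete any pair of adjacent letters that have the same value. For example, the string "aabcc" becomes either "aab" or "bcc" after one operation.
Sample input: aaabccddd
Sample output: abd
How do I iterate over a list or string to find the duplicates and remove them? This is what I tried, and I know it is wrong.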
S = input()
removals = []
for i in range(0, len(S)):
    if i + 1 >= len(S):
        break
    elif S[i] == S[i + 1]:
        removals.append(i)
        # removals is to store all the indexes that are to be deleted.
        removals.append(i + 1)
        i += 1
    print(i)
Array = list(S)
set(removals)  # removes duplicates from removals
for j in range(0, len(removals)):
    Array.pop(removals[j])  # Creates IndexOutOfRange error
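This is a HackerRank problem: Super Reduced String
Answer 0 (score: 1)
Removing pairs of letters reduces each run of identical letters to an empty sequence if the run has an even length, or to a single letter if its length is odd: aaaaaa becomes empty, and aaaaa is reduced to a.
To do this for any sequence, use itertools.groupby() and count the group sizes: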
from itertools import groupby
# only include a value if its consecutive count is odd
[v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]
Then repeat until the size of the sequence no longer changes:
prev = len(sequence) + 1
while len(sequence) < prev:
    prev = len(sequence)
    sequence = [v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]
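For reference, the two snippets above can be combined into one self-contained function. This is only a sketch: the name reduce_groupby is made up for illustration, and it assumes the input is a string and that a string is wanted back.

```python
from itertools import groupby

def reduce_groupby(text):
    """Drop even-length runs and shrink odd-length runs to one letter,
    repeating until the length stops changing. (Hypothetical helper name.)"""
    sequence = list(text)
    prev = len(sequence) + 1
    while len(sequence) < prev:
        prev = len(sequence)
        # keep one copy of a value only if its run length is odd
        sequence = [v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]
    return ''.join(sequence)

print(reduce_groupby('aaabccddd'))  # abd
```

The loop is needed because removing a run can bring two equal letters next to each other, as in 'abba', where dropping 'bb' leaves 'aa'.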
However, since HackerRank gives you text, it will be faster to do this with a regular expression:
import re
even = re.compile(r'(?:([a-z])\1)+')
prev = len(text) + 1
while len(text) < prev:
    prev = len(text)
    text = even.sub(r'', text)
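In the regular expression, [a-z] matches a lowercase letter, (..) captures what was matched as a group, and \1 references that first group, so it only matches if the same letter is repeated. (?:...)+ then matches one or more such pairs in a row. re.sub() replaces all of those matches with empty text.
The regex approach is good enough to pass the HackerRank challenge.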
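To see what the pattern does on each pass, here is a small sketch that records the intermediate strings. The helper name reduce_regex_trace is hypothetical, and it stops when a pass changes nothing rather than comparing lengths, which amounts to the same condition here.

```python
import re

even = re.compile(r'(?:([a-z])\1)+')

def reduce_regex_trace(text):
    """Return the input followed by the string after each substitution pass."""
    steps = [text]
    while True:
        reduced = even.sub('', text)
        if reduced == text:
            # nothing was removed, so the string is fully reduced
            return steps
        steps.append(reduced)
        text = reduced

print(reduce_regex_trace('aaabccddd'))  # ['aaabccddd', 'abd']
print(reduce_regex_trace('abba'))       # ['abba', 'aa', '']
```

In the first input, 'ccdd' is removed as a single match because the group is captured afresh for each pair. The second input needs two passes, since the 'aa' pair only appears after 'bb' has been removed.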
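Answer 1 (score: 1)
You can use a stack to achieve O(n) time complexity. Iterate over the characters in the string and, for each one, check whether the top of the stack holds the same character. If it does, pop that character from the stack and move on to the next item; otherwise, push the character onto the stack. Whatever remains on the stack is the result: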
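s = 'aaabccddd'
stack = []
for c in s:
    if stack and stack[-1] == c:
        # the top of the stack pairs up with c, so both letters disappear
        stack.pop()
    else:
        stack.append(c)
# print() with a single argument works on both Python 2 and Python 3
print(''.join(stack) if stack else 'Empty String')  # abd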
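As an illustration of why a single pass is enough, the sketch below records the stack contents after every character. The function name stack_trace is made up for this example.

```python
def stack_trace(s):
    """Return the stack contents, joined into a string, after each character."""
    stack = []
    states = []
    for c in s:
        if stack and stack[-1] == c:
            stack.pop()
        else:
            stack.append(c)
        states.append(''.join(stack))
    return states

print(stack_trace('abba'))  # ['a', 'ab', 'a', '']
```

After 'bb' cancels, the earlier 'a' is back on top of the stack, so the final 'a' cancels it immediately and no second pass over the string is needed.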
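Update: following the discussion, I ran a few tests to measure the speed of the regex and stack-based solutions on inputs of length 100. The tests were run with Python 2.7 on Windows 8; each figure is the time in seconds for 10,000 calls: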
All same
Regex: 0.0563033799756
Stack: 0.267807865445
Nothing to remove
Regex: 0.075074750044
Stack: 0.183467329017
Worst case
Regex: 1.9983200193
Stack: 0.196362265609
Alphabet
Regex: 0.0759905517997
Stack: 0.182778728207
Code used for the benchmark:
import re
import timeit

def reduce_regexp(text):
    even = re.compile(r'(?:([a-z])\1)+')
    prev = len(text) + 1
    while len(text) < prev:
        prev = len(text)
        text = even.sub(r'', text)
    return text

def reduce_stack(s):
    stack = []
    for c in s:
        if stack and stack[-1] == c:
            stack.pop()
        else:
            stack.append(c)
    return ''.join(stack)

CASES = [
    ['All same', 'a' * 100],
    ['Nothing to remove', 'ab' * 50],
    ['Worst case', 'ab' * 25 + 'ba' * 25],
    ['Alphabet', ''.join([chr(ord('a') + i) for i in range(25)] * 4)]
]

for name, case in CASES:
    print(name)
    res = timeit.timeit('reduce_regexp(case)',
                        setup='from __main__ import reduce_regexp, case; import re',
                        number=10000)
    print('Regex: {}'.format(res))
    res = timeit.timeit('reduce_stack(case)',
                        setup='from __main__ import reduce_stack, case',
                        number=10000)
    print('Stack: {}'.format(res))
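The benchmark only measures time, so it may be worth checking that both functions return the same result as well. The following is a rough sketch of such a check on random lowercase inputs; agree_on_random_inputs and its parameters are hypothetical and were not part of the original test run.

```python
import random
import re

EVEN = re.compile(r'(?:([a-z])\1)+')

def reduce_regexp(text):
    prev = len(text) + 1
    while len(text) < prev:
        prev = len(text)
        text = EVEN.sub('', text)
    return text

def reduce_stack(s):
    stack = []
    for c in s:
        if stack and stack[-1] == c:
            stack.pop()
        else:
            stack.append(c)
    return ''.join(stack)

def agree_on_random_inputs(trials=1000, max_len=30, alphabet='abc', seed=0):
    """Compare both reducers on random strings; a small alphabet makes pairs likely."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, max_len)
        text = ''.join(rng.choice(alphabet) for _ in range(length))
        if reduce_regexp(text) != reduce_stack(text):
            return False
    return True

print(agree_on_random_inputs())
```

A fixed seed keeps the check reproducible. The inputs are limited to lowercase letters because the pattern only matches [a-z].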