Remove duplicates if they form an adjacent pair

Time: 2016-08-06 09:16:02

Tags: python arrays string list

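Delete any pair of adjacent letters that have the same value. For example, the string "aabcc" becomes either "aab" or "bcc" after one operation.

Sample input = aaabccddd
Sample output = abd

How do I iterate over a list or string to match the duplicates and remove them? This is what I tried, and I know it is wrong.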

S = input()
removals = []

for i in range(0, len(S)):
    if i + 1 >= len(S):
        break

    elif S[i] == S[i + 1]:
        removals.append(i)    
        # removals is to store all the indexes that are to be deleted.
        removals.append(i + 1)
        i += 1
    print(i)
Array = list(S)
set(removals)    #removes duplicates from removals

for j in range(0, len(removals)):
    Array.pop(removals[j])    # Creates IndexOutOfRange error

This is a Hackerrank problem: Super Reduced String

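2 answers:

Answer 0: (score: 1)

Removing pairs of letters reduces a run of identical letters to an empty sequence if the run has an even length, and to a single letter if the length is odd. aaaaaa becomes empty, aaaaa reduces to a.

To do this for any sequence, use itertools.groupby() and count the group sizes: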

from itertools import groupby

# only include a value if its consecutive count is odd
[v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]

Then repeat until the size of the sequence no longer changes:

prev = len(sequence) + 1
while len(sequence) < prev:
    prev = len(sequence)
    sequence = [v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]
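
For reference, the two snippets above can be combined into one function. This is only a minimal sketch; the name reduce_groupby is made up for illustration and is not part of the answer:

```python
from itertools import groupby


def reduce_groupby(sequence):
    """Repeatedly collapse runs of equal items until nothing changes.

    reduce_groupby is a hypothetical helper name used for this sketch.
    """
    sequence = list(sequence)
    prev = len(sequence) + 1
    while len(sequence) < prev:
        prev = len(sequence)
        # A run of odd length leaves one item behind, an even run leaves none.
        sequence = [v for v, group in groupby(sequence)
                    if sum(1 for _ in group) % 2]
    return sequence


print(''.join(reduce_groupby('aaabccddd')))  # abd
```

The loop is needed because removing one run can bring two equal letters next to each other, as in 'abba', which needs a second pass to become empty.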

However, since Hackerrank gives you text, it will be faster to do this with a regular expression:

import re

even = re.compile(r'(?:([a-z])\1)+')

prev = len(text) + 1
while len(text) < prev:
    prev = len(text)
    text = even.sub(r'', text)
In the regular expression, [a-z] matches a lowercase letter, (...) captures that match as a group, and \1 references the first group, so it only matches if that letter is repeated. (?:...)+ asks for one or more such pairs in a row. re.sub() replaces all of those patterns with empty text.

The regex approach is good enough to pass the Hackerrank challenge.
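
To make the behaviour of the pattern a bit more concrete, here is a small sketch that shows what a single substitution pass matches and removes. The helper name one_pass is an assumption added for illustration:

```python
import re

even = re.compile(r'(?:([a-z])\1)+')


def one_pass(text):
    """Apply the substitution once (hypothetical helper for this sketch)."""
    return even.sub('', text)


# Each match is a stretch of back-to-back pairs; the pairs may use different letters.
print([m.group(0) for m in even.finditer('aaabccddd')])  # ['aa', 'ccdd']
print(one_pass('aaabccddd'))  # abd

# A single pass is not always enough: removing 'bb' creates a new pair 'aa'.
print(one_pass('abba'))            # aa
print(repr(one_pass(one_pass('abba'))))  # ''
```

This is why the substitution sits inside a while loop that runs until the length stops shrinking.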

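Answer 1: (score: 1)

You can use a stack to achieve O(n) time complexity. Iterate over the characters in the string, and for each one check whether the top of the stack holds the same character. If it does, pop that character from the stack and move on to the next item. Otherwise push the character onto the stack. Whatever remains on the stack is the result: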

s = 'aaabccddd'
stack = []

for c in s:
    if stack and stack[-1] == c:
        stack.pop()
    else:
        stack.append(c)

print(''.join(stack) if stack else 'Empty String')  # abd
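
If it helps to see the stack evolve, the following sketch records the stack contents after every character. The function name trace_stack is hypothetical and only exists for this illustration:

```python
def trace_stack(s):
    """Return the stack contents (as a string) after each character of s.

    trace_stack is a made-up name for this sketch.
    """
    stack = []
    steps = []
    for c in s:
        if stack and stack[-1] == c:
            stack.pop()          # c cancels the equal letter on top
        else:
            stack.append(c)      # nothing to cancel, keep c
        steps.append(''.join(stack))
    return steps


print(trace_stack('aaabccddd'))
# ['a', '', 'a', 'ab', 'abc', 'ab', 'abd', 'ab', 'abd']
```

Each character is pushed at most once and popped at most once, which is where the O(n) bound comes from.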

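Update: following the discussion, I ran a few tests to measure the speed of the regex and stack-based solutions on inputs of length 100. The tests were run on Python 2.7 on Windows 8. Each figure is the total time in seconds for 10,000 calls: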

All same
Regex: 0.0563033799756
Stack: 0.267807865445
Nothing to remove
Regex: 0.075074750044
Stack: 0.183467329017
Worst case
Regex: 1.9983200193
Stack: 0.196362265609
Alphabet
Regex: 0.0759905517997
Stack: 0.182778728207

Code used for the benchmark:

import re
import timeit

def reduce_regexp(text):
    even = re.compile(r'(?:([a-z])\1)+')

    prev = len(text) + 1
    while len(text) < prev:
        prev = len(text)
        text = even.sub(r'', text)

    return text

def reduce_stack(s):
    stack = []

    for c in s:
        if stack and stack[-1] == c:
            stack.pop()
        else:
            stack.append(c)

    return ''.join(stack)


CASES = [
    ['All same', 'a' * 100],
    ['Nothing to remove', 'ab' * 50],
    ['Worst case', 'ab' * 25 + 'ba' * 25],
    ['Alphabet', ''.join([chr(ord('a') + i) for i in range(25)] * 4)]
]

for name, case in CASES:
    print(name)
    res = timeit.timeit('reduce_regexp(case)',
                        setup='from __main__ import reduce_regexp, case; import re',
                        number=10000)
    print('Regex: {}'.format(res))
    res = timeit.timeit('reduce_stack(case)',
                        setup='from __main__ import reduce_stack, case',
                        number=10000)
    print('Stack: {}'.format(res))
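
One possible way to see why the 'Worst case' input hurts the regex version is to count how many substitution passes the loop makes. This is a rough sketch, and count_passes is a hypothetical helper, not part of the benchmark above:

```python
import re

even = re.compile(r'(?:([a-z])\1)+')


def count_passes(text):
    """Count how many times even.sub() runs before the length stops shrinking.

    count_passes is a made-up name for this sketch.
    """
    passes = 0
    prev = len(text) + 1
    while len(text) < prev:
        prev = len(text)
        text = even.sub('', text)
        passes += 1
    return passes


print(count_passes('a' * 100))                # 2
print(count_passes('ab' * 50))                # 1
print(count_passes('ab' * 25 + 'ba' * 25))    # 51
```

In the worst case only the single pair in the middle can be removed on each pass, so the regex rescans the string about 50 times, while the stack version always walks the input once.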