How can I exclude multiple ranges from a string in Python?

Time: 2018-07-06 16:14:44

Tags: python string range

I have the following string:

str = "AAbbbCddEE"

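I also know the ranges of letters that should be excluded from the string; here they are 2:5 and 6:8. The expected result for this example is the string AACEE.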

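There may be more than two ranges (or only one), and the ranges may overlap each other. Suppose the ranges 2:5, 6:8 and 4:9 are excluded from the same string; then I expect the result AAE. How can I do this in Python?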

2 answers:

Answer 0 (score: 4)

Option 1
Using any and enumerate

Keep your ranges in a list, then use any and enumerate to check whether each index is contained in any of the ranges:

>>> s = "AAbbbCddEE"
>>> ranges = [range(2,5), range(6,8), range(4,9)]
>>> ''.join([letter for idx, letter in enumerate(s) if not any(idx in rng for rng in ranges)])
'AAE'

Option 2
Use set difference to determine which indices to keep...

r = set(range(len(s)))
for rng in ranges:
    r -= set(rng)
# {0, 1, 9}

...then join with a list comprehension:

>>> ''.join([letter for idx, letter in enumerate(s) if idx in r])
'AAE'
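The two steps of Option 2 can be wrapped in a single function. This is only a minimal sketch: the name exclude_ranges is hypothetical, and it assumes the ranges are passed as (start, stop) pairs with stop excluded, as in slice notation.

```python
def exclude_ranges(s, ranges):
    """Return s without the characters whose index lies in any (start, stop) pair."""
    # Start from every index of the string, then discard the excluded ones.
    keep = set(range(len(s)))
    for start, stop in ranges:
        keep -= set(range(start, stop))
    # Rebuild the string from the characters whose index survived.
    return ''.join(ch for idx, ch in enumerate(s) if idx in keep)


print(exclude_ranges("AAbbbCddEE", [(2, 5), (6, 8)]))          # AACEE
print(exclude_ranges("AAbbbCddEE", [(2, 5), (6, 8), (4, 9)]))  # AAE
```

Overlapping ranges need no special handling here, because removing an index from the set twice has the same effect as removing it once.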

I strongly recommend the second approach. The overhead of building the initial set is still preferable to checking every range for every element:

# Initial List

s = "AAbbbCddEE"
ranges = [range(2,5), range(6,8), range(4,9)]

%timeit ''.join([letter for idx, letter in enumerate(s) if not any(idx in rng for rng in ranges)])
8.38 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
r = set(range(len(s)))
for rng in ranges:
    r -= set(rng)
''.join([letter for idx, letter in enumerate(s) if idx in r])

3.35 µs ± 59.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Much larger list

len(s)
100000

len(ranges)
300

%timeit ''.join([letter for idx, letter in enumerate(s) if not any(idx in rng for rng in ranges)])
3.25 s ± 13.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
r = set(range(len(s)))
for rng in ranges:
    r -= set(rng)
''.join([letter for idx, letter in enumerate(s) if idx in r])

18.8 ms ± 90.9 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
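The timings above do not show how the larger inputs were built. The sketch below is one way to try a similar comparison outside IPython with the standard timeit module. The input is randomly generated, which is an assumption and not the original data. The sizes are deliberately smaller than 100000 and 300 so that it finishes quickly, and option1 and option2 are hypothetical names for the two approaches.

```python
import random
import timeit


def option1(s, ranges):
    # Check every index against every range.
    return ''.join([letter for idx, letter in enumerate(s)
                    if not any(idx in rng for rng in ranges)])


def option2(s, ranges):
    # Build the set of kept indices once, then filter.
    r = set(range(len(s)))
    for rng in ranges:
        r -= set(rng)
    return ''.join([letter for idx, letter in enumerate(s) if idx in r])


# Assumed test data: random letters and random short ranges.
random.seed(0)
n, k = 10_000, 50
s = ''.join(random.choice("abcdefghij") for _ in range(n))
ranges = []
for _ in range(k):
    start = random.randrange(n)
    ranges.append(range(start, min(n, start + random.randrange(1, 100))))

# Both approaches must agree before their timings mean anything.
assert option1(s, ranges) == option2(s, ranges)

for fn in (option1, option2):
    seconds = timeit.timeit(lambda: fn(s, ranges), number=3) / 3
    print(f"{fn.__name__}: {seconds * 1000:.2f} ms per call")
```

The absolute numbers will differ from those quoted above, since they depend on the machine and on how the ranges are distributed.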

Answer 1 (score: 0)

You can do newstring = oldstring.replace(oldstring[a:b], ""), where [a:b] is the range to exclude. If you don't want a new variable, you can assign the result back to oldstring instead of newstring.
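One caveat worth checking before relying on this: str.replace removes every occurrence of the substring, not only the one at positions a to b. The sketch below compares it with plain slicing; remove_slice is a hypothetical helper name.

```python
def remove_slice(s, a, b):
    """Drop the characters at positions a..b-1, leaving other copies of that text alone."""
    return s[:a] + s[b:]


s = "abXYcdXY"
print(s.replace(s[2:4], ""))  # abcd   (both "XY" runs are removed)
print(remove_slice(s, 2, 4))  # abcdXY (only positions 2 and 3 are removed)
```

With several ranges, either approach also has to account for the indices shifting after each removal, for example by removing the rightmost range first.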

One thing, though: if you want to exclude 2:5, 6:8 and 4:9 from the same string, couldn't you simply exclude 2:9, or am I missing something?
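The observation that 2:5, 6:8 and 4:9 collapse into 2:9 can be generalized by merging the ranges first and then slicing around them. This is a rough sketch, assuming non-negative (start, stop) pairs with start <= stop; merge_ranges and exclude_merged are hypothetical names.

```python
def merge_ranges(ranges):
    """Merge overlapping or touching (start, stop) pairs into sorted, disjoint pairs."""
    merged = []
    for start, stop in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop])
    return [tuple(pair) for pair in merged]


def exclude_merged(s, ranges):
    """Join the pieces of s that lie between the merged ranges."""
    pieces = []
    pos = 0
    for start, stop in merge_ranges(ranges):
        pieces.append(s[pos:start])
        pos = stop
    pieces.append(s[pos:])
    return ''.join(pieces)


print(merge_ranges([(2, 5), (6, 8), (4, 9)]))                  # [(2, 9)]
print(exclude_merged("AAbbbCddEE", [(2, 5), (6, 8), (4, 9)]))  # AAE
print(exclude_merged("AAbbbCddEE", [(2, 5), (6, 8)]))          # AACEE
```

Because the merged ranges are disjoint and sorted, the string is cut with a handful of slices instead of a per-character membership test.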