Suppose I have a rope that spans positions 0 to 5000. I want to cut out the intervals in the list below and return the pieces that remain:
My list:
['HE670029', '4095', '4096']
['HE670029', '4098', '4099']
['HE670029', '4102', '4102']
Desired output (it doesn't have to be a list; each list could also be written on its own line of a file):
['HE670029', '0', '4094']
['HE670029', '4097', '4097']
['HE670029', '4100', '4101']
['HE670029', '4103', '5000']
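I tried manipulating dictionaries, without success. I don't know how to convert this into a format that lets me do what I need.
Answer 0: (score: 1)
It's not pretty, but it works: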
sections_to_cut = [
    ['HE670029', '4095', '4096'],
    ['HE670029', '4098', '4099'],
    ['HE670029', '4102', '4102']
]
ropes = {}
for rope in sections_to_cut:
    if rope[0] not in ropes:  # could use default dict instead
        ropes[rope[0]] = []
    ropes[rope[0]].append((int(rope[1]), int(rope[2])))
cut_ropes = []
for rope_name, exclude_values in ropes.items():
    sorted_ex = sorted(exclude_values, key=lambda x: x[0])
    a = 0
    for i in sorted_ex:
        cut_ropes.append([rope_name, str(a), str(i[0]-1)])
        a = i[1] + 1
    cut_ropes.append([rope_name, str(a), str(5000)])
print(cut_ropes)
# [['HE670029', '0', '4094'], ['HE670029', '4097', '4097'], ['HE670029', '4100', '4101'], ['HE670029', '4103', '5000']]
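The code comment above mentions that a default dict could be used instead. A minimal sketch of that variant follows, assuming non-overlapping cuts as in the question. The function name `remaining_sections` and the `rope_end` parameter are hypothetical, introduced only for illustration. It also skips empty pieces, which the loop above would emit as `['name', '0', '-1']` if a cut started at position 0.

```python
from collections import defaultdict


def remaining_sections(sections_to_cut, rope_end=5000):
    """Return the pieces left after removing each [name, start, end] cut.

    Hypothetical helper: assumes cuts on the same rope do not overlap.
    """
    cuts_by_rope = defaultdict(list)  # missing keys start as empty lists
    for name, start, end in sections_to_cut:
        cuts_by_rope[name].append((int(start), int(end)))

    remaining = []
    for name, cuts in cuts_by_rope.items():
        position = 0  # first position that has not been cut away yet
        for start, end in sorted(cuts):
            if start > position:  # skip empty pieces, e.g. a cut starting at 0
                remaining.append([name, str(position), str(start - 1)])
            position = end + 1
        if position <= rope_end:  # skip an empty tail if a cut reaches the end
            remaining.append([name, str(position), str(rope_end)])
    return remaining


sections_to_cut = [
    ['HE670029', '4095', '4096'],
    ['HE670029', '4098', '4099'],
    ['HE670029', '4102', '4102'],
]
for section in remaining_sections(sections_to_cut):
    print(section)  # one remaining piece per line, as the question allows
```

Printing one piece per line matches the alternative output format the question asks about.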
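Answer 1: (score: 0)
I started writing this before I saw that your intervals can't overlap. This approach is overkill, but I'll leave it here because throwing it away seems like a waste.
For a short solution, see the bottom.
The OOP-ish way of doing things: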
class Interval:
    def __init__(self,left,right):
        self.left = int(left)
        self.right = int(right)
    def __contains__(self,x):
        return self.left <= int(x) <= self.right
intervals = [['HE670029', '4095', '4096'],
             ['HE670029', '4098', '4099'],
             ['HE670029', '4102', '4102']]
#if intervals aren't sorted, then do (compare starts as ints, not strings):
#cuts = [Interval(*x[1:]) for x in sorted(intervals,key=lambda i: int(i[1]))]
cuts = [Interval(*x[1:]) for x in intervals]
#this step is overkill, since we know our intervals can't overlap
breakpoints = [x for x in range(1,5000) if any(x in cut for cut in cuts)]
def gen_segments(breakpoints, id_='HE670029', start=0, end=5000 ):
    for pair in chunks(breakpoints,2):
        if len(pair) < 2: #last breakpoint may be singleton
            pair += pair
        left,right = pair
        yield id_, start, left-1
        start = right+1
    yield id_, start, end
chunks is one of the several chunking recipes on this page. Demo:
list(gen_segments(breakpoints))
Out[258]:
[('HE670029', 0, 4094),
('HE670029', 4097, 4097),
('HE670029', 4100, 4101),
('HE670029', 4103, 5000)]
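One caveat: pairing the scanned breakpoints two at a time only works while each cut covers at most two positions, as in this data. A hedged sketch of a variant that groups consecutive positions into runs instead is shown here, so that longer or overlapping cuts are handled too. The helper names `covered_positions`, `runs`, and `gen_segments_from_runs` are hypothetical and not part of the answer's code.

```python
def covered_positions(cuts, end=5000):
    """Sorted positions in 0..end that lie inside at least one (left, right) cut."""
    return [x for x in range(end + 1)
            if any(left <= x <= right for left, right in cuts)]


def runs(positions):
    """Group sorted integers into (first, last) runs of consecutive values."""
    result = []
    for x in positions:
        if result and x == result[-1][1] + 1:
            result[-1] = (result[-1][0], x)  # extend the current run
        else:
            result.append((x, x))  # start a new run
    return result


def gen_segments_from_runs(positions, id_='HE670029', start=0, end=5000):
    """Yield (id, start, end) for every stretch not covered by a run."""
    for left, right in runs(positions):
        if left > start:  # skip empty segments
            yield id_, start, left - 1
        start = right + 1
    if start <= end:
        yield id_, start, end


# Illustrative data (not from the question): the first two cuts overlap.
cuts = [(4095, 4097), (4096, 4099), (4102, 4102)]
print(list(gen_segments_from_runs(covered_positions(cuts))))
```

Scanning every position is still overkill for non-overlapping input, as the answer itself says; the point of the sketch is only that grouping by runs removes the dependence on cut length.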
Short solution: since your intervals can't overlap, you don't need the Interval class or anything else. Just do:
breakpoints = [int(x) for interval in intervals for x in interval[1:]]
Then use gen_segments from above directly.
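Put together, the short version might look like the sketch below. The `chunks` helper is an assumption: the answer only points to a page of recipes, so a common slicing-based recipe is used here. `gen_segments` is restated in simplified form so that the block runs on its own, and the intervals are assumed to be sorted and non-overlapping.

```python
def chunks(seq, n):
    """Yield successive n-sized slices of seq (assumed recipe, not from the answer)."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]


def gen_segments(breakpoints, id_='HE670029', start=0, end=5000):
    """Yield the segments between consecutive (left, right) breakpoint pairs."""
    for left, right in chunks(breakpoints, 2):
        yield id_, start, left - 1
        start = right + 1
    yield id_, start, end


intervals = [['HE670029', '4095', '4096'],
             ['HE670029', '4098', '4099'],
             ['HE670029', '4102', '4102']]

# Flattening the endpoints always gives an even number of breakpoints,
# so every chunk is a complete (left, right) pair.
breakpoints = [int(x) for interval in intervals for x in interval[1:]]
segments = list(gen_segments(breakpoints))
print(segments)
```

Because each interval contributes exactly its two endpoints, this version does not depend on how many positions a cut covers.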
Answer 2: (score: 0)
I won't spoil it for you, but I'll give you a hint. Given
xs = [
['HE670029', '4095', '4096'],
['HE670029', '4098', '4099'],
['HE670029', '4102', '4102']]
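The first and last segments are easy: the first runs from 0 up to the first node, and the last ends at 5000. What you need are the intermediate values...
First, create functions that extract the values at the two ends of a row: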
def head(x): return int(x[1])
def last(x): return int(x[-1])
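Now you need to pair each row with the row that follows it, like this:
[(a, b) for a, b in zip(xs[:-1], xs[1:])]
Now that you have these pairs, you can use the functions you just created to extract the last value of the first row and the first value of the second...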
[(last(a),head(b)) for (a,b) in zip(xs[:-1], xs[1:])]
These aren't quite the values you want, are they? You need to shift them by one...
[(last(a)+1,head(b)-1) for (a,b) in zip(xs[:-1], xs[1:])]
Finally, just put them into a list of the right shape:
xM = [['HE670029', str(last(a)+1),str(head(b)-1)] for (a,b) in zip(xs[:-1], xs[1:])]
Now you have two lists, xs and xM. I'm sure you can loop over them and combine them... If you want to polish the result, you could consider using zip, list, and concat.
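For readers who want to check their own attempt, one possible way to finish the hint is sketched below. Rather than merging xs and xM, it adds the first and last segments described earlier in this answer to the in-between list. The variable names `name`, `first`, `middle`, `final`, and `result` are hypothetical, and the rows are assumed to be sorted, non-overlapping, and all for the same rope.

```python
def head(x):
    return int(x[1])


def last(x):
    return int(x[-1])


xs = [
    ['HE670029', '4095', '4096'],
    ['HE670029', '4098', '4099'],
    ['HE670029', '4102', '4102']]

name = xs[0][0]  # assumes every row refers to the same rope

# Segment before the first cut: 0 up to just before the first start.
first = [[name, '0', str(head(xs[0]) - 1)]]

# Segments between consecutive cuts (the xM list from the answer).
middle = [[name, str(last(a) + 1), str(head(b) - 1)]
          for a, b in zip(xs[:-1], xs[1:])]

# Segment after the last cut: just after the last end up to 5000.
final = [[name, str(last(xs[-1]) + 1), '5000']]

result = first + middle + final
for row in result:
    print(row)
```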