I have two lists:
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
The result I want is C = [1, 2, 3, 4, 5, 6, 7]
if I specify an overlap of 2.
My code so far:
concat_list = []
word_overlap = 2
for lst in [lst1, lst2, lst3]:
    if (len(concat_list) != 0):
        if (concat_list[-word_overlap:] != lst[:word_overlap]):
            concat_list += lst
        elif ([concat_list[-word_overlap:]] == lst[:word_overlap]):
            raise SystemExit
    else:
        concat_list += lst
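I am doing this for lists of strings, but it should work the same way.
Edit:
What I want my code to do is first check whether there is any overlap (of 1, 2, etc.), then concatenate the lists while removing the overlap (so I don't get doubled elements).
[1,2,3,4,5] + [4,5,6,7] = [1,2,3,4,5,6,7]
but
[1,2,3] + [4,5,6] = [1,2,3,4,5,6]
I also want it to check for any overlap smaller than the word_overlap I set.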
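Answer 0 (score: 1)
You can use set and union:
s.union(t): a new set with elements from both s and t
>>> list(set(A) | set(B))
[1, 2, 3, 4, 5, 6, 7]
But you cannot control the exact amount of overlap this way.
To answer your question, you will have to get a little tricky and combine several set operations:
use slicing
get a new list with the elements that are in either A or B, but not both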
OVERLAP = 1
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
C = list(set(A) | set(B))  # union: [1, 2, 3, 4, 5, 6, 7]
D = list(set(A) & set(B))  # intersection: [4, 5] (set order is not guaranteed)
D = D[OVERLAP:]  # [5]
print(list(set(C) ^ set(D)))  # symmetric difference: [1, 2, 3, 4, 6, 7]
Just for fun, a one-liner that gives the same result:
list((set(A) | set(B)) ^ set(list(set(A) & set(B))[OVERLAP:])) # [1, 2, 3, 4, 6, 7]
Here OVERLAP is the constant that sets how many of the shared elements are kept.
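One caveat with the set-based approach: sets discard duplicates and do not keep the original order, so it only matches the desired result when every item is unique and already sorted. The sketch below illustrates this with made-up word lists (the values are assumptions chosen for illustration, not data from the question):

```python
# Hypothetical word lists; "the" occurs twice in A on purpose.
A = ["the", "cat", "sat", "the"]
B = ["sat", "the", "mat"]

merged = set(A) | set(B)

# Duplicates collapse, so the repeated "the" is lost.
print(len(A) + len(B), len(merged))  # 7 4

# sorted() makes the order deterministic, but it is alphabetical,
# not the order in which the words appeared.
print(sorted(merged))  # ['cat', 'mat', 'sat', 'the']
```

For lists of words, where repeats and order matter, a slice-based comparison is probably the safer route.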
Answer 1 (score: 1)
Here is a naive variant:
def concat_nooverlap(a,b):
    maxoverlap=min(len(a),len(b))
    for overlap in range(maxoverlap,-1,-1):
        # Check for longest possible overlap first
        if a[-overlap:]==b[:overlap]:
            break # Found an overlap, don't check any shorter
    return a+b[overlap:]
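This would be more efficient with a type that supports slicing by reference (such as buffers or NumPy arrays).
One odd thing about this version is that when it reaches overlap = 0, it compares the whole of a (a slice, which is a copy of the list) with an empty slice of b. That comparison fails unless a is empty, but the loop still ends with overlap = 0, so the return value is correct. We can handle this case explicitly with a slight rewrite: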
def concat_nooverlap(a,b):
    maxoverlap=min(len(a),len(b))
    for overlap in range(maxoverlap,0,-1):
        # Check for longest possible overlap first
        if a[-overlap:]==b[:overlap]:
            return a+b[overlap:]
    else:
        return a+b
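The question also asks for a cap on the overlap (word_overlap) and loops over more than two lists. A minimal sketch of how the function above might be extended for that, assuming a hypothetical name `concat_with_overlap` and a hypothetical optional `max_overlap` parameter (neither appears in the original answer):

```python
from functools import reduce


def concat_with_overlap(a, b, max_overlap=None):
    """Join a and b, dropping the longest suffix of a that is also a prefix of b.

    max_overlap (hypothetical parameter) caps how long an overlap is considered.
    """
    limit = min(len(a), len(b))
    if max_overlap is not None:
        limit = min(limit, max_overlap)
    # Try the longest allowed overlap first, then shorter ones.
    for overlap in range(limit, 0, -1):
        if a[-overlap:] == b[:overlap]:
            return a + b[overlap:]
    # No overlap found: plain concatenation.
    return a + b


print(concat_with_overlap([1, 2, 3, 4, 5], [4, 5, 6, 7], max_overlap=2))
# [1, 2, 3, 4, 5, 6, 7]
print(concat_with_overlap([1, 2, 3], [4, 5, 6], max_overlap=2))
# [1, 2, 3, 4, 5, 6]

# Folding over several lists of strings, as in the question's loop.
words = reduce(concat_with_overlap, [["a", "b", "c"], ["c", "d"], ["d", "e", "f"]])
print(words)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Because the loop starts at 1 rather than 0, the `a[-0:]` edge case never arises, and `reduce` simply applies the two-list join from left to right.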
Answer 2 (score: 0)
I'm not sure I have interpreted your question correctly, but you could do this:
A = [1, 2, 3, 4, 5]
B = [4, 5, 6, 7]
overlap = 2
# Drop the last `overlap` items of A, then append all of B
print(A[:-overlap] + B)
If you want to make sure the overlapping parts hold the same values, your check could be:
if A[-overlap:] == B[:overlap]:
    print(A[:-overlap] + B)
else:
    print("error")
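The check above can be wrapped in a function. This is only a sketch: the name `join_exact_overlap` and the choice of raising `ValueError` are assumptions, not part of the original answer. It also guards against overlap = 0, because `A[:-0]` is an empty list and would silently drop all of A.

```python
def join_exact_overlap(a, b, overlap):
    """Join a and b, requiring exactly `overlap` shared items at the seam."""
    if overlap <= 0:
        # a[:-0] would be [], so handle "no overlap" explicitly.
        return a + b
    if a[-overlap:] != b[:overlap]:
        raise ValueError("last %d items of a do not match first %d items of b"
                         % (overlap, overlap))
    return a[:-overlap] + b


print(join_exact_overlap([1, 2, 3, 4, 5], [4, 5, 6, 7], 2))
# [1, 2, 3, 4, 5, 6, 7]
print(join_exact_overlap([1, 2, 3], [4, 5, 6], 0))
# [1, 2, 3, 4, 5, 6]
```

Unlike the question's requirement, this version does not look for shorter overlaps; it accepts only the exact length it is given.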
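Answer 3 (score: 0)
Assuming both lists contain consecutive values and the values in list a are always smaller than those in list b, I came up with this solution. It can also help you detect the overlap.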
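def concatenate_list(a, b):
    max_a = a[-1]  # last (largest) value of a
    min_b = b[0]   # first (smallest) value of b
    if max_a >= min_b:
        print('overlap exists')
        # Values are consecutive, so the overlap length is max_a - min_b + 1
        b = b[(max_a - min_b) + 1:]
    else:
        print('no overlap')
    return a + b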
For strings, you can do this as well:
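def concatenate_list_strings(a, b):
    count = 0
    for i in range(min(len(a), len(b))):
        # Compare the last count+1 items of a with the first count+1 items of b
        max_a = a[len(a) - 1 - count:]
        min_b = b[0:count + 1]
        if max_a == min_b:
            b = b[count + 1:]
            # Returns a tuple here; count is zero-based (overlap length - 1)
            return 'overlap count ' + str(count), a + b
        count += 1
    # No overlap found: returns a plain list
    return a + b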
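Note that the string version above stops at the shortest matching overlap, whereas Answer 1 looks for the longest one first. The two can disagree when a list repeats items. A small sketch of the difference, using hypothetical helper names and made-up input:

```python
def shortest_overlap_join(a, b):
    # Try overlap sizes 1, 2, 3, ... and stop at the first match.
    for size in range(1, min(len(a), len(b)) + 1):
        if a[-size:] == b[:size]:
            return a + b[size:]
    return a + b


def longest_overlap_join(a, b):
    # Try the largest possible overlap first, then shorter ones.
    for size in range(min(len(a), len(b)), 0, -1):
        if a[-size:] == b[:size]:
            return a + b[size:]
    return a + b


a = ["x", "y", "x"]
b = ["x", "y", "x", "z"]
print(shortest_overlap_join(a, b))  # ['x', 'y', 'x', 'y', 'x', 'z']
print(longest_overlap_join(a, b))   # ['x', 'y', 'x', 'z']
```

Which behaviour is right depends on the data; for removing doubled elements, as the question describes, the longest-first search seems the closer fit.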