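Given list_a and list_b, I want to run list_b through a function that returns every possible sublist of list_b (this part of the code works). I then want to take each sublist of list_b and check whether it is also a sublist of list_a. If it is, I should get a list of all the indexes, or slices, where that sublist appears in list_a.
I was able to get the code working for sublists of length 1, but I cannot get it to work for longer ones.
Here is my current code for this problem: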
import numpy as np
a = [0,1,2,3,0,2,3]
b = [0,2,3]
sublists = []
def all_sublists(my_list):
    """Make a list containing every sublist of my_list."""
    for i in range(len(my_list)):
        n = i+1
        while n <= len(my_list):
            sublist = my_list[i:n]
            sublists.append(sublist)
            n += 1
def sublists_splice(sublist, my_list):
    """If sublist is in my_list, print sublist and the corresponding indexes."""
    values = np.array(my_list)
    print(str(sublist) + " found at " + str(np.where(values == sublist)[0]))
all_sublists(b)
for sublist in sublists:
    sublists_splice(sublist, a)
Here is the output of the code:
[0] found at [0 4]
[0, 2] found at []
[0, 2, 3] found at []
[2] found at [2 5]
[2, 3] found at []
[3] found at [3 6]
/home/nicholas/Desktop/sublists.py:27: DeprecationWarning: elementwise == comparison failed; this will raise an error in the future.
Here is what I want:
[0] found at [0 4]
[0, 2] found at [4:6]
[0, 2, 3] found at [4:7]
[2] found at [2 5]
[2, 3] found at [2:4 5:7]
[3] found at [3 6]
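I assume there is a pythonic way to solve this. I have tried a few pieces of code, but they were all very long and did not work...
One last note: I do need these to be sublists, where order matters, not subsets.
I appreciate any help. Thank you.
Answer 0 (score: 1)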
Here is a solution using itertools.combinations. Note that I have made it as lazy as possible, but that does not mean it is the most efficient solution.
from itertools import combinations
import numpy as np
a = [0,1,2,3,0,2,3]
b = [0,2,3]
def get_combs(x):
    return (list(c) for i in range(1, len(x)) for c in combinations(x, i))
def get_index_arr(x, y):
    n = len(x)
    lst = (y[i:i+n] for i in range(len(y)-len(x)+1))
    return (i for i, j in enumerate(lst) if x == j)
combs = get_combs(b)
d = {tuple(c): list(get_index_arr(c, a)) for c in combs}
# {(0,): [0, 4],
# (0, 2): [4],
# (0, 3): [],
# (2,): [2, 5],
# (2, 3): [2, 5],
# (3,): [3, 6]}
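As the output above shows, combinations also yields order-preserving but non-contiguous selections such as (0, 3), and range(1, len(x)) stops before the full-length list (0, 2, 3). Below is a minimal pure-Python sketch, assuming only contiguous runs are wanted, as the question states. The names contiguous_sublists and find_slices are hypothetical helpers introduced for illustration and are not part of the answer above.

```python
def contiguous_sublists(seq):
    """Lazily yield every contiguous, non-empty sublist of seq."""
    for start in range(len(seq)):
        for stop in range(start + 1, len(seq) + 1):
            yield seq[start:stop]


def find_slices(sub, seq):
    """Return (start, stop) pairs such that seq[start:stop] == sub."""
    n = len(sub)
    return [(i, i + n) for i in range(len(seq) - n + 1) if seq[i:i + n] == sub]


a = [0, 1, 2, 3, 0, 2, 3]
b = [0, 2, 3]

for sub in contiguous_sublists(b):
    # Format each match as start:stop, similar to the slices the question asks for.
    spans = " ".join(f"{i}:{j}" for i, j in find_slices(sub, a))
    print(f"{sub} found at [{spans}]")
```

Here every match is reported as a start:stop slice, including the length-1 sublists, so [2, 3] is reported at 2:4 and 5:7. Each comparison copies a slice of seq, so this favours readability over speed.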
Answer 1 (score: 1)
Using the tools from Find boolean mask by pattern:
import numpy as np  # a and b are the lists defined in the question
def rolling_window(a, window):  #https://stackoverflow.com/q/7100242/2901002
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    c = np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
    return c
def vview(a):  #based on @jaime's answer: https://stackoverflow.com/a/16973510/4427777
    return np.ascontiguousarray(a).view(np.dtype((np.void, a.dtype.itemsize * a.shape[1])))
def sublist_loc(a, b):
    a, b = np.array(a), np.array(b)
    n = min(len(b), len(a))
    sublists = [rolling_window(b, i) for i in range(1, n + 1)]
    target_lists = [rolling_window(a, i) for i in range(1, n + 1)]
    locs = [[np.flatnonzero(vview(target_lists[i]) == s) for s in vview(subl)] \
            for i, subl in enumerate(sublists)]
    for i in range(n):
        for j, k in enumerate(sublists[i]):
            print(str(k) + " found starting at index " + str(locs[i][j]))
    return sublists, target_lists, locs
_ = sublist_loc(a, b)
[0] found starting at index [0 4]
[2] found starting at index [2 5]
[3] found starting at index [3 6]
[0 2] found starting at index [4]
[2 3] found starting at index [2 5]
[0 2 3] found starting at index [4]
As an added bonus, all the rolling_window and vview calls are just views of the original arrays, so storing the combinations does not incur a large memory hit.
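The same rolling-window idea can also be sketched with np.lib.stride_tricks.sliding_window_view, which builds the windowed view without hand-computing strides. This sketch assumes NumPy 1.20 or newer, where that function is available. The name find_starts is a hypothetical helper, not part of the answer above, and it compares windows element-wise rather than through the void-dtype view used by vview.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_starts(haystack, needle):
    """Return the start indices where needle occurs contiguously in haystack."""
    haystack = np.asarray(haystack)
    needle = np.asarray(needle)
    # Guard: sliding_window_view rejects windows longer than the array.
    if len(needle) == 0 or len(needle) > len(haystack):
        return np.array([], dtype=int)
    # Read-only view of shape (len(haystack) - len(needle) + 1, len(needle)).
    windows = sliding_window_view(haystack, len(needle))
    # A window matches when every element equals the needle.
    return np.flatnonzero((windows == needle).all(axis=1))


a = [0, 1, 2, 3, 0, 2, 3]
print(find_starts(a, [2, 3]))     # start indices of [2, 3] in a
print(find_starts(a, [0, 2, 3]))  # start indices of [0, 2, 3] in a
```

The window array itself is a view, but the element-wise comparison allocates a boolean array with one entry per window element, so very long patterns on very long arrays still cost memory.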