Subsets of a tuple

Time: 2013-11-09 10:02:39

Tags: python python-3.x

How can I produce:

("A","b"),("b","C"),("A",),("b",),("C",)

from:

("A","b","C")

Perhaps itertools could be used, but I can't seem to find a suitable function.

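Update

Note that ("A","C") is not in the expected output, because I want each subset to contain only members that are consecutive in the original tuple.

Another example: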

subset(("A","b","C","D"))

should produce:

("A","b","C"),
("b","C","D"),
("A","b"),
("b","C"),
("C","D"),
("A",),
("b",),
("C",),
("D",)

3 answers:

Answer 0: (score: 5)

You can use the rolling window recipe:

def window(seq, n=2):
    """
    Returns a sliding window (of width n) over data from the sequence
    s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
    """
    for i in range(len(seq)-n+1):
        yield tuple(seq[i:i+n])

def shrinking_window(seq):
    for i in range(len(seq)-1, 0, -1):
        yield from window(seq, i)

print(list(shrinking_window('AbC')))
# [('A', 'b'), ('b', 'C'), ('A',), ('b',), ('C',)]
print(list(shrinking_window('AbCD')))
# [('A', 'b', 'C'), ('b', 'C', 'D'), ('A', 'b'), ('b', 'C'), ('C', 'D'), ('A',), ('b',), ('C',), ('D',)]
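
Since the question asks about itertools, here is a minimal sketch of the same result built on `itertools.combinations`. The name `contiguous_subsets` is hypothetical and is not an itertools function. It assumes that every pair of slice boundaries `(start, stop)` with `start < stop` identifies one contiguous run, and that the full-length run should be left out, as in the expected output.

```python
from itertools import combinations

def contiguous_subsets(seq):
    """Hypothetical helper: proper contiguous sub-tuples, longest first."""
    seq = tuple(seq)
    n = len(seq)
    # Each pair (start, stop) with start < stop describes one slice;
    # drop the pair that would reproduce the whole input.
    bounds = [(i, j) for i, j in combinations(range(n + 1), 2) if j - i < n]
    # Longest slices first. list.sort() is stable, so slices of equal
    # length keep their left-to-right order.
    bounds.sort(key=lambda b: b[0] - b[1])
    return [seq[i:j] for i, j in bounds]

print(contiguous_subsets("AbC"))
# [('A', 'b'), ('b', 'C'), ('A',), ('b',), ('C',)]
```

This builds the whole list in memory before returning it, whereas `shrinking_window` above yields lazily.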

Answer 1: (score: 2)

Here is one way:

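input_tuple = ("A", "b", "C", "d")
output_tuples = []
# Python 3: range replaces Python 2's xrange.
# Walk the sub-tuple lengths from longest (len - 1) down to 1.
for subtuple_length in reversed(range(1, len(input_tuple))):
    # Slide a window of that length across every valid start index.
    for start_index in range(0, (len(input_tuple) + 1 - subtuple_length)):
        output_tuples.append(input_tuple[start_index:start_index + subtuple_length])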

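This builds a list of the consecutive sub-tuples. You could also print them, or yield them from a generator, or whatever else you need. It is quadratic in the length of the input tuple, but so is the size of your expected result set, so I'm not sure there is a way around that.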
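
The generator form mentioned above could look like the following sketch. The function name `consecutive_subtuples` is hypothetical. The last lines check the quadratic size claim: lengths 1 through n - 1 contribute n(n + 1)/2 - 1 sub-tuples in total.

```python
def consecutive_subtuples(input_tuple):
    """Yield proper contiguous sub-tuples, longest first (hypothetical name)."""
    n = len(input_tuple)
    for subtuple_length in range(n - 1, 0, -1):
        # A window of this length fits at n - subtuple_length + 1 positions.
        for start_index in range(n - subtuple_length + 1):
            yield input_tuple[start_index:start_index + subtuple_length]

result = list(consecutive_subtuples(("A", "b", "C", "d")))
print(result)

# Size check for n = 4: 2 + 3 + 4 = 9 = n * (n + 1) // 2 - 1
n = 4
assert len(result) == n * (n + 1) // 2 - 1
```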

Answer 2: (score: 2)

subset(("A","b","C","D"))
     

should produce:

("A","b","C"), 
("b","C","D"),
("A","b"),
("b","C"),
("C","D"),
("A",),
("b",),
("C",),
("D",)

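Sliding windows are hard. Iteratively shrinking or growing the window is doubly so.

First list the steps needed to solve the problem, then write a function that follows them:

  1. Start with the largest window size (one less than the total length, going by the example code).
  2. Then calculate the number of windows required to cover the data set.
  3. Then, for each window, reuse its number as the start index, and add the window size to the start index to determine where that window stops.
  4. The resulting function: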

    def subset(data):
        total_length = len(data)
        for window_length in range(total_length - 1, 0, -1): # biggest first
            n_windows = total_length - window_length + 1
            for each_window in range(n_windows):
                start = each_window
                stop = start + window_length
                yield data[start:stop]
    

    Example data:

    data = ("A","b","C","D")
    

    Now, calling subset on data returns a generator, which gives us the result if we pass it to list:

    >>> subset(data)
    <generator object subset at 0x7fbc3d7f3570>
    >>> list(subset(data))
    [('A', 'b', 'C'), ('b', 'C', 'D'), ('A', 'b'), ('b', 'C'), ('C', 'D'), ('A',), ('b',), ('C',), ('D',)]
    

    Deque solution:

    I was intrigued by the idea of using a deque (from the collections module) for the rolling window, and decided to demonstrate it:

    import collections
    import pprint
    
    def shrinking_windows(iterable):
        '''
        Given an ordered iterable (meaningless for unordered ones)
        return a list of tuples representing each possible set
        of consecutive items from the original list. e.g.
        shrinking_windows(['A', 'b', 'c']) returns 
        [('A', 'b', 'c'), ('A', 'b'), ('b', 'c') ...] but not ('A', 'c')
        '''
        window_generator = range(len(iterable), 0, -1)
        results = []
        for window in window_generator:
            d = collections.deque((), maxlen=window)
            for i in iterable:
                d.append(i)
                if len(d) == window:
                    results.append(tuple(d))
        return results
    
    pprint.pprint(shrinking_windows('AbCd'))
    

    which nicely returns:

    [('A', 'b', 'C', 'd'),
     ('A', 'b', 'C'),
     ('b', 'C', 'd'),
     ('A', 'b'),
     ('b', 'C'),
     ('C', 'd'),
     ('A',),
     ('b',),
     ('C',),
     ('d',)]
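
Note that the output above also includes the full four-element tuple, which the question's expected output leaves out. A minimal variation is sketched below, assuming the only change wanted is to start one window size smaller. The name `proper_shrinking_windows` is hypothetical.

```python
import collections

def proper_shrinking_windows(iterable):
    """Like shrinking_windows, but without the full-length window (hypothetical name)."""
    items = list(iterable)  # materialize so len() works for any iterable
    results = []
    # Largest window is one shorter than the input, so the input itself is skipped.
    for window in range(len(items) - 1, 0, -1):
        d = collections.deque(maxlen=window)
        for item in items:
            d.append(item)  # once full, the deque drops its oldest item
            if len(d) == window:
                results.append(tuple(d))
    return results

print(proper_shrinking_windows('AbCd'))
```

Copying the input into a list first means iterators and generators work too, at the cost of holding all items in memory.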