Question

I'm trying to find the best way to solve the following problem. By best I mean the least complex.

The input is a list of (start, length) tuples, such as:

[(0,5),(0,1),(1,9),(5,5),(5,7),(10,1)]

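Each element represents a sequence by its start and length; for example, (5,7) is equivalent to the sequence (5,6,7,8,9,10,11) - a list of 7 elements starting at 5. You can assume the tuples are sorted by the start element.

The output should return the non-overlapping combination of tuples that represents the longest continuous sequence. That is, a solution is a subset of the ranges with no overlaps and no gaps, and it is the longest possible one, although there could be more than one.

For example, for the given input the solution is: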

[(0,5),(5,7)], equivalent to (0,1,2,3,4,5,6,7,8,9,10,11)

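Is backtracking the best approach to solve this problem?

I'm interested in any different approaches that people could suggest.

Also, if anyone knows a formal reference for this problem, or for another similar one, I'd like to get references.

BTW - this is not homework.

Edit

Just to avoid some mistakes, here is another example of the expected behavior.

For an input like [(0,1),(1,7),(3,20),(8,5)] the right answer is [(3,20)], equivalent to (3,4,5,...,22) with length 20. Some of the answers received would give [(0,1),(1,7),(8,5)], equivalent to (0,1,2,...,11,12), as the right answer. But this last answer is not correct because it is shorter than [(3,20)].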
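
As a reference point for checking the answers below, here is a minimal brute-force sketch of the requirement as stated in the question. It is not taken from any answer; the names `is_gap_free_chain` and `brute_force_longest` are hypothetical, and it assumes positive lengths and input sorted by start. It tries every subset, so it is only useful on small inputs.

```python
from itertools import combinations

def is_gap_free_chain(chain):
    """True if each tuple starts exactly where the previous one ends."""
    return all(a[0] + a[1] == b[0] for a, b in zip(chain, chain[1:]))

def brute_force_longest(tuples):
    """Try every subset (kept in input order) and return the longest valid chain."""
    best, best_len = [], 0
    for size in range(1, len(tuples) + 1):
        for chain in combinations(tuples, size):
            total = sum(length for _, length in chain)
            if is_gap_free_chain(chain) and total > best_len:
                best, best_len = list(chain), total
    return best
```

On the two inputs from the question this returns [(0, 5), (5, 7)] and [(3, 20)] respectively.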

Answer 1

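Iterate over the list of tuples using the given ordering (by start element), while using a hashmap to keep track of the length of the longest continuous sequence ending at a given index.

Pseudo-code, skipping details such as items not found in the hashmap (assume 0 is returned if not found):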

int bestEnd = 0;
hashmap<int,int> seq // seq[key] = length of the longest sequence ending on key-1, or 0 if not found
foreach (tuple in orderedTuples) {
    int seqLength = seq[tuple.start] + tuple.length
    int tupleEnd = tuple.start+tuple.length;
    seq[tupleEnd] = max(seq[tupleEnd], seqLength)
    if (seqLength > seq[bestEnd]) bestEnd = tupleEnd
}
return new tuple(bestEnd-seq[bestEnd], seq[bestEnd])

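This is an O(N) algorithm.

If you need the actual tuples making up this sequence, you also need to keep a linked list of tuples hashed by end index, updating it whenever the max length is updated for that end point.

UPDATE: My knowledge of Python is rather limited, but based on the Python code you pasted, I created this code, which returns the actual sequence instead of just the length: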

def get_longest(arr):
    bestEnd = 0
    seqLengths = dict() #seqLengths[key] = length of the longest sequence ending on key-1, or 0 if not found
    seqTuples = dict() #seqTuples[key] = the last tuple used in this longest sequence
    for t in arr:
        seqLength = seqLengths.get(t[0],0) + t[1]
        tupleEnd = t[0] + t[1]
        if (seqLength > seqLengths.get(tupleEnd,0)):
            seqLengths[tupleEnd] = seqLength
            seqTuples[tupleEnd] = t
            if seqLength > seqLengths.get(bestEnd,0):
                bestEnd = tupleEnd
    longestSeq = []
    while (bestEnd in seqTuples):
        longestSeq.append(seqTuples[bestEnd])
        bestEnd -= seqTuples[bestEnd][1]
    longestSeq.reverse()
    return longestSeq


if __name__ == "__main__":
    a = [(0,3),(1,4),(1,1),(1,8),(5,2),(5,5),(5,6),(10,2)]
    print(get_longest(a))

Answer 2

Revised algorithm:

create a hashtable of start->list of tuples that start there
put all tuples in a queue of tupleSets
set the longestTupleSet to the first tuple
while the queue is not empty
    take tupleSet from the queue
    if any tuples start where the tupleSet ends
        foreach tuple that starts where the tupleSet ends
            enqueue new tupleSet of tupleSet + tuple
        continue

    if tupleSet is longer than longestTupleSet
        replace longestTupleSet with tupleSet

return longestTupleSet

C# implementation

public static IList<Pair<int, int>> FindLongestNonOverlappingRangeSet(IList<Pair<int, int>> input)
{
    var rangeStarts = input.ToLookup(x => x.First, x => x);
    var adjacentTuples = new Queue<List<Pair<int, int>>>(
        input.Select(x => new List<Pair<int, int>>
            {
                x
            }));

    var longest = new List<Pair<int, int>>
        {
            input[0]
        };
    int longestLength = input[0].Second; // Second is the length, not the end position

    while (adjacentTuples.Count > 0)
    {
        var tupleSet = adjacentTuples.Dequeue();
        var last = tupleSet.Last();
        int end = last.First + last.Second;
        var sameStart = rangeStarts[end];
        if (sameStart.Any())
        {
            foreach (var nextTuple in sameStart)
            {
                adjacentTuples.Enqueue(tupleSet.Concat(new[] { nextTuple }).ToList());
            }
            continue;
        }
        int length = end - tupleSet.First().First;
        if (length > longestLength)
        {
            longestLength = length;
            longest = tupleSet;
        }
    }

    return longest;
}

Tests:

[Test]
public void Given_the_first_problem_sample()
{
    var input = new[]
        {
            new Pair<int, int>(0, 5),
            new Pair<int, int>(0, 1),
            new Pair<int, int>(1, 9),
            new Pair<int, int>(5, 5),
            new Pair<int, int>(5, 7),
            new Pair<int, int>(10, 1)
        };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(2);
    result.First().ShouldBeSameInstanceAs(input[0]);
    result.Last().ShouldBeSameInstanceAs(input[4]);
}

[Test]
public void Given_the_second_problem_sample()
{
    var input = new[]
        {
            new Pair<int, int>(0, 1),
            new Pair<int, int>(1, 7),
            new Pair<int, int>(3, 20),
            new Pair<int, int>(8, 5)
        };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(1);
    result.First().ShouldBeSameInstanceAs(input[2]);
}
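
For comparison with the Python code elsewhere in this thread, the same queue-based search could be sketched in Python roughly as follows. This is an illustrative port, not the answer author's code: the name `longest_chain_bfs` is hypothetical, and it assumes every length is positive (a zero-length tuple would keep extending itself). Because it enqueues every extendable chain, the number of chains it visits can grow very quickly on inputs with many alternatives.

```python
from collections import defaultdict, deque

def longest_chain_bfs(tuples):
    """Breadth-first growth of chains; a chain is scored only when it cannot be extended."""
    starts = defaultdict(list)          # start position -> tuples beginning there
    for t in tuples:
        starts[t[0]].append(t)

    queue = deque([t] for t in tuples)  # every tuple seeds its own chain
    best, best_len = [], 0
    while queue:
        chain = queue.popleft()
        end = chain[-1][0] + chain[-1][1]
        followers = starts.get(end, [])
        if followers:
            for nxt in followers:
                queue.append(chain + [nxt])
            continue
        length = end - chain[0][0]      # chains have no gaps, so this is the total length
        if length > best_len:
            best, best_len = chain, length
    return best
```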

Answer 3

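This is a special case of the longest path problem for weighted directed acyclic graphs.

The nodes in the graph are the start points and the points just after the last element of a sequence, where the next sequence could start.

The problem is special because the distance between two nodes must be the same regardless of the path.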

Answer 4

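I removed the previous solution because it was not tested.

The problem is finding the longest path in a "weighted directed acyclic graph", which can be solved in linear time: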

http://en.wikipedia.org/wiki/Longest_path_problem#Weighted_directed_acyclic_graphs

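Take the set {start positions} union {(start position + length)} as the vertices. For your example it would be {0, 1, 5, 10, 11, 12}.

For vertices v0 and v1, if some tuple starting at v0 has a length w such that v0 + w = v1, add a directed edge from v0 to v1 with w as its weight.

Now follow the pseudocode on the Wikipedia page. Since the number of vertices is at most 2n (where n is the number of tuples), the problem can still be solved in linear time.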
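
The construction above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `longest_path_dag` is hypothetical, lengths are assumed positive (so every edge points to a larger position and ascending order is a valid topological order), and the input is assumed non-empty. The sketch sorts the vertices for simplicity, which costs O(n log n) rather than the linear time mentioned above.

```python
from collections import defaultdict

def longest_path_dag(tuples):
    """Longest path where vertices are positions and each tuple is a weighted edge."""
    edges = defaultdict(list)                 # v0 -> list of (v1, weight)
    vertices = set()
    for start, length in tuples:
        edges[start].append((start + length, length))
        vertices.update((start, start + length))

    dist = {v: 0 for v in vertices}           # best path weight ending at v
    prev = {}                                 # v -> (previous vertex, tuple used)
    for v in sorted(vertices):                # ascending order is a topological order
        for v1, w in edges[v]:
            if dist[v] + w > dist[v1]:
                dist[v1] = dist[v] + w
                prev[v1] = (v, (v, w))

    end = max(dist, key=dist.get)             # vertex where the heaviest path ends
    path = []
    while end in prev:                        # walk the back-pointers to the start
        end, used = prev[end]
        path.append(used)
    return path[::-1]
```

For the question's first example the vertex set is {0, 1, 5, 10, 11, 12} and the heaviest path ends at 12, giving [(0, 5), (5, 7)].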

Answer 5

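Edited to replace pseudocode with actual Python code.

Edited AGAIN to change the code; the original algorithm was on the right track, but I misunderstood what the second value in each pair was! Fortunately the basic algorithm is the same, and I was able to change it.

Here's an idea that solves the problem in O(N log N) and doesn't use a hash map (so no hidden costs). For memory we're going to use N * 2 "things".

We're going to add two more values to each tuple: (BackCount, BackLink). In the successful combination, BackLink will link from right to left, from the right-most tuple to the left-most tuple. BackCount will be the accumulated count of values for the given BackLink.

Here's some Python code: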

def FindTuplesStartingWith(tuples, frm):
    # The Log(N) algorithm is left as an exercise for the user
    ret=[]
    for i in range(len(tuples)):
        if (tuples[i][0]==frm): ret.append(i)
    return ret

def FindLongestSequence(tuples):

    # Prepare (BackCount, BackLink) array
    bb=[] # (BackCount, BackLink)
    for OneTuple in tuples: bb.append((-1,-1))

    # Prepare
    LongestSequenceLen=-1
    LongestSequenceTail=-1

    # Algorithm
    for i in range(len(tuples)):
        if (bb[i][0] == -1): bb[i] = (0, bb[i][1])
        # Is this single pair the longest possible pair all by itself?
        if (tuples[i][1] + bb[i][0]) > LongestSequenceLen:
            LongestSequenceLen = tuples[i][1] + bb[i][0]
            LongestSequenceTail = i
        # Find next segment
        for j in FindTuplesStartingWith(tuples, tuples[i][0] + tuples[i][1]):
            if ((bb[j][0] == -1) or (bb[j][0] < (bb[i][0] + tuples[i][1]))):
                # can be linked
                bb[j] = (bb[i][0] + tuples[i][1], i)
                if ((bb[j][0] + tuples[j][1]) > LongestSequenceLen):
                    LongestSequenceLen = bb[j][0] + tuples[j][1]
                    LongestSequenceTail=j

    # Done! I'll now build up the solution
    ret=[]
    while (LongestSequenceTail > -1):
        ret.insert(0, tuples[LongestSequenceTail])
        LongestSequenceTail = bb[LongestSequenceTail][1]
    return ret

# Call the algorithm
print(FindLongestSequence([(0,5), (0,1), (1,9), (5,5), (5,7), (10,1)]))
# prints [(0, 5), (5, 7)]
print(FindLongestSequence([(0,1), (1,7), (3,20), (8,5)]))
# prints [(3, 20)]

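The key to the whole algorithm is at the "THIS IS THE KEY" comment in the code. We know our current StartTuple can be linked to EndTuple. If a longer sequence ending at EndTuple.To exists, it was already found by the time we reach this point, because it had to start at a smaller StartTuple.From, and the array is sorted on "From"!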
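
The O(log N) lookup that the code leaves as an exercise could be sketched with the standard `bisect` module. This is one possible approach, not the answer author's: the name `find_start_indices` is hypothetical, and unlike `FindTuplesStartingWith` it takes a precomputed sorted list of start values (built once from the tuples) so that each lookup costs only two binary searches.

```python
from bisect import bisect_left, bisect_right

def find_start_indices(starts, frm):
    """Indices i with starts[i] == frm, where starts is sorted ascending."""
    lo = bisect_left(starts, frm)    # first index whose start is >= frm
    hi = bisect_right(starts, frm)   # first index whose start is > frm
    return range(lo, hi)             # empty when no tuple starts at frm

tuples = [(0, 5), (0, 1), (1, 9), (5, 5), (5, 7), (10, 1)]
starts = [t[0] for t in tuples]      # build once, reuse for every lookup
print(list(find_start_indices(starts, 5)))   # prints [3, 4]
```

This relies on the tuples being sorted by start, which the question guarantees.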

Answer 6

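This is a simple reduce operation. Given a pair of consecutive tuples, they either can or cannot be combined. So define the pairwise combination function: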

def combo(first,second):
    if first[0]+first[1] == second[0]:
        return [(first[0],first[1]+second[1])]
    else:
        return [first,second]

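This just returns a list containing either one element that combines the two arguments, or the original two elements.

Then define a function to iterate over the list and combine pairs: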

def collapse(tupleList):
    first = tupleList.pop(0)
    newList = []
    for item in tupleList:
        collapsed = combo(first,item)
        if len(collapsed)==2:
            newList.append(collapsed[0])
        first = collapsed.pop()
    newList.append(first)
    return newList

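This keeps a first element to compare with the current item in the list (starting at the second item); when it can't combine them, it drops first into a new list and replaces first with the second of the two.

Then just call collapse with the list of tuples: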

>>> collapse( [(5, 7), (12, 3), (0, 5), (0, 7), (7, 2), (9, 3)] )
[(5, 10), (0, 5), (0, 12)]

[Edit] Finally, iterate over the result to get the longest sequence.

def longest(seqs):
    collapsed = collapse(seqs)
    return max(collapsed, key=lambda x: x[1])

[/Edit]

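Complexity O(N). For bonus marks, do it in reverse so that the initial pop(0) becomes a pop() and you don't have to reindex the array, or move the iterator instead. For top marks, make it run as a pairwise reduce operation for multithreaded goodness.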
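
As a rough illustration of the "reduce" framing, the same pass can be written with `functools.reduce`. This is a sequential sketch only, not the multithreaded variant hinted at above; `collapse_reduce` is a hypothetical name, and `combo` is repeated from the answer so the block runs on its own. Unlike `collapse`, it does not pop from the input list.

```python
from functools import reduce

def combo(first, second):
    # Same pairwise rule as in the answer: merge when second starts where first ends.
    if first[0] + first[1] == second[0]:
        return [(first[0], first[1] + second[1])]
    return [first, second]

def collapse_reduce(tuple_list):
    """Fold left to right, merging each item into the last run when they are adjacent."""
    def step(acc, item):
        if acc:
            acc[-1:] = combo(acc[-1], item)   # replace the last run with 1 or 2 entries
        else:
            acc.append(item)
        return acc
    return reduce(step, tuple_list, [])

print(collapse_reduce([(5, 7), (12, 3), (0, 5), (0, 7), (7, 2), (9, 3)]))
# prints [(5, 10), (0, 5), (0, 12)]
```

Note that, like `collapse`, this only merges tuples that are neighbours in list order.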

Answer 7

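Just thinking about the algorithm in basic terms, would this work?

(Apologies for the horrible syntax, but I'm trying to stay language-independent here.)

First, the simplest form: find the longest contiguous pair.

Cycle through every member and compare it to every other member with a higher startpos. If the startpos of the second member is equal to the sum of the startpos and length of the first member, they are contiguous. If so, form a new member in a new set, with the lower startpos and the combined length, to represent this.

Then take each of these pairs, compare them to all of the single members with a higher startpos, and repeat, forming a new set of contiguous triples (if any exist).

Continue this pattern until you have no new sets.

The tricky part then is that you have to compare the length of every member of each of your sets to find the real longest chain.

I'm pretty sure this is not as efficient as other methods, but I believe it is a viable approach to brute-forcing this solution.

I'd appreciate feedback on this and on any errors I may have overlooked.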
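
The level-by-level idea described above could be sketched in Python like this. It is a direct, unoptimized reading of the description, with a hypothetical function name (`longest_by_levels`) and the assumptions that the input is non-empty and every length is positive.

```python
def longest_by_levels(tuples):
    """Grow chains one tuple at a time: singles, then pairs, then triples, and so on."""
    # Each entry is (chain_start, combined_length, members).
    level = [(s, l, [(s, l)]) for s, l in tuples]
    best = max(level, key=lambda c: c[1])
    while level:
        next_level = []
        for start, length, members in level:
            for s, l in tuples:
                if s == start + length:      # contiguous: starts where the chain ends
                    next_level.append((start, length + l, members + [(s, l)]))
        if next_level:
            candidate = max(next_level, key=lambda c: c[1])
            if candidate[1] > best[1]:
                best = candidate
        level = next_level                   # stop once no chain could be extended
    return best[2]
```

Tracking the best chain while the sets are built avoids the separate comparison pass at the end, at the cost of still visiting every contiguous chain.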

Answer 8

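This sounds like a perfect "dynamic programming" problem...

The simplest program would be to do it by brute force (e.g. recursively), but this has exponential complexity.

With dynamic programming you can set up an array a of length n, where n is the maximum of all (start+length) values of your problem, and where a[i] denotes the longest non-overlapping sequence up to a[i]. You can then step through all tuples, updating a. The complexity of this algorithm would be O(n*k), where k is the number of input values.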
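
One way to read this description is sketched below. It is an interpretation rather than the answer author's code: `longest_dp` and the `back` array are hypothetical names, a[i] is taken to mean the length of the longest gap-free chain ending just before position i, and the sketch assumes non-negative starts, positive lengths, and tuples sorted by start. Under that reading each tuple needs only one array update.

```python
def longest_dp(tuples):
    """a[i] = length of the longest gap-free chain that ends just before position i."""
    n = max(s + l for s, l in tuples)
    a = [0] * (n + 1)
    back = [None] * (n + 1)            # tuple that achieved a[i], for reconstruction
    for s, l in tuples:                # relies on tuples being sorted by start
        if a[s] + l > a[s + l]:
            a[s + l] = a[s] + l
            back[s + l] = (s, l)
    end = max(range(n + 1), key=a.__getitem__)
    chain = []
    while back[end] is not None:       # follow the stored tuples back to the chain start
        chain.append(back[end])
        end -= back[end][1]
    return chain[::-1]
```

The array size depends on the largest end position rather than on the number of tuples, so this is mainly attractive when positions are small.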

Answer 9

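Create an ordered array of all start and end points and initialize all of them to one.
For each item in your tuples, compare its end points (start and end) with the ordered items in your array; if any point lies between them (e.g. a point in the array is 5 and you have start 2 with length 4), change its value to zero.
After finishing the loop, start moving across the ordered array: create a strip when you see a 1, keep adding to the existing strip while you see 1s, close the strip at any zero, and so on.
At the end, check the lengths of the strips.

I think the complexity is around O(4-5*N) (see update), with N being the number of items in the tuple list.

UPDATE

As you figured out, the complexity is not accurate, but it is definitely very small, since it is a function of the number of line stretches (tuples).

So if N is the number of line stretches, sorting is O(2N * log2N). Comparison is O(2N). Finding the line stretches is also O(2N). So all in all, O(2N(log2N + 2)).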

Algorithm to find the longest non-overlapping sequences
