All subsequences of CONSECUTIVE items common to two List<int>

Time: 2017-07-21 09:59:30

Tags: c# list

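A few days ago I posted a question: "Is there an algorithm to find all the common, consecutive subsequences in two lists, preserving order?" Yeldar Kurmangaliyev very insightfully pointed out that what I really need is all the common subsequences of two lists.

Since I did not get any answer to the previous question, I would like to know whether an efficient algorithm exists to find all the subsequences of common CONSECUTIVE items in two lists, or whether I have to try to come up with the best algorithm I can.

Thanks in advance.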
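To make the requirement concrete, here is a minimal sketch of one reading of the question: a candidate run qualifies if it appears as a contiguous block in both lists. The helper name `ContainsRun`, the class name, and the sample data are hypothetical; the snippet only illustrates the definition and is not meant as an efficient solution.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonRunDefinition
{
    // Hypothetical helper: true if 'run' occurs as a contiguous block inside 'list'.
    public static bool ContainsRun<T>(IReadOnlyList<T> list, IReadOnlyList<T> run)
    {
        for (int start = 0; start + run.Count <= list.Count; start++)
        {
            // Compare the window of run.Count items that begins at 'start' with the run.
            if (list.Skip(start).Take(run.Count).SequenceEqual(run))
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        var list2 = new List<int> { 9, 2, 3, 7 };
        var candidate = new List<int> { 2, 3 };

        // { 2, 3 } is consecutive in both lists, so it counts as a common run.
        Console.WriteLine(ContainsRun(list1, candidate) && ContainsRun(list2, candidate)); // True

        // { 3, 4 } is consecutive in list1 only, so it does not count.
        Console.WriteLine(ContainsRun(list2, new List<int> { 3, 4 })); // False
    }
}
```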

1 answer:

Answer 0 (score: 0)

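It has been a while since I asked whether an efficient algorithm exists to find all the subsequences of common, consecutive items in two lists.

In all this time nobody has suggested any algorithm that solves this problem, so I concluded that no known algorithm exists and developed one myself.

I decided to share it here for two reasons:
1. To receive feedback. Having no formal education in theoretical computer science, I would greatly appreciate any suggestions about the algorithm's efficiency and possible performance improvements.
2. In the hope that it will help someone in the future.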

    /// <summary>
    /// Problem statement:
    ///     Given two lists: list1 and list2, find all the sequences of common consecutive items in the two lists.
    /// 
    /// General considerations:
    ///     1. No constraints or limitations are imposed on the two lists, so they may contain any number of items, even a very large number; the efficiency and
    ///        performance of the algorithm are therefore paramount.
    ///     2. No constraints or limitations are imposed on the items themselves, so the two lists may contain any type of items: items of value type or items of reference 
    ///        type.
    ///        For items of reference type, the references to a common item stored in the two lists are meant to point to the same object (the same memory address),
    ///        and not to two objects that are considered equal only because some of their properties happen to have equal values at the moment.
    ///        Note that the code compares items with the default equality comparer, so this holds only for types that do not override Equals and GetHashCode.
    ///     3. The problem asks for SEQUENCES of items. In this context, a single item may be considered a sequence (a degenerate sequence of a single item).
    ///     4. The problem asks for sequences of COMMON items, so it makes sense to work only with the items contained in both lists and to ignore all other items (the ones 
    ///        contained in only one of the lists), in order to improve the efficiency of the algorithm.
    ///     5. The problem asks for sequences of common CONSECUTIVE items, so the order of the items in the lists is important. The only items of interest are those common 
    ///        items that form sequences of consecutive items contained in BOTH lists, not only in one of them (items that are consecutive in both lists).
    ///        
    /// The algorithm:
    /// Step 1: Find all the common items of the 2 lists.
    ///         A straightforward way to find all the common items of 2 lists is their set intersection, for example with:
    ///                  public static IEnumerable<TSource> Intersect<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second)
    ///         Enumerable.Intersect<TSource> Method: 
    ///         (IEnumerable<TSource>, IEnumerable<TSource>)
    ///         Produces the set intersection of two sequences by using the default equality comparer to compare values.
    ///         The result is an IEnumerable<T> whose distinct elements appear in both IEnumerable<TSource> that are intersected.
    ///         The code below uses HashSet<T>.IntersectWith instead, so that the common items are kept in a HashSet for fast Contains lookups in Steps 2 and 3.
    /// Step 2: Find all the sequences of common consecutive items in list1.
    /// Step 3: Find all the sequences of common consecutive items in list2.
    /// Step 4: Find all the sequences of common consecutive items contained in both: list1 and in list2 (the common sequences).
    /// </summary>
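    /// <param name="list1">The first list.</param>
    /// <param name="list2">The second list.</param>
    /// <returns>All the sequences of consecutive items that are contained in both list1 and list2.</returns>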
    public static List<List<T>> FindAllSequencesOfCommonConsecutiveItemsInTwoLists<T>(List<T> list1, List<T> list2)
    {
        // Step 1: Find all the common items of the 2 lists.
        HashSet<T> allCommonItems = new HashSet<T>(list1);
        allCommonItems.IntersectWith(list2); // Keeps only the items that are also contained in list2.

        // Step 2: Find all the sequences of common consecutive items in list1.
        List<List<T>> sequencesOfCommonConsecutiveItemsInList1 = FindAllSequencesOfConsecutiveItemsInListFormedOnlyWithItemsContainedAlsoInSourceItems(list1, allCommonItems);
        // Step 3: Find all the sequences of common consecutive items in list2.
        List<List<T>> sequencesOfCommonConsecutiveItemsInList2 = FindAllSequencesOfConsecutiveItemsInListFormedOnlyWithItemsContainedAlsoInSourceItems(list2, allCommonItems);
        // Step 4: Find all the sequences of common consecutive items contained in both: list1 and in list2 (the common sequences).
        List<List<T>> commonSequencesOfConsecutiveItems = FindCommonSequencesInTwoListsOfLists(sequencesOfCommonConsecutiveItemsInList1, sequencesOfCommonConsecutiveItemsInList2);

        return commonSequencesOfConsecutiveItems;
    }

    public static List<List<T>> FindAllSequencesOfConsecutiveItemsInListFormedOnlyWithItemsContainedAlsoInSourceItems<T>(List<T> list, HashSet<T> sourceItems)
    {
        if (list == null || list.Count == 0 || sourceItems == null || sourceItems.Count == 0)
        {
            return new List<List<T>>(); // Return an empty result instead of null, so that callers can safely enumerate it.
        }
        // Holds all the sequences of consecutive items in the list, formed only of items that are also contained in the HashSet sourceItems.
        List<List<T>> allSequencesOFConsecutiveItemsInList = new List<List<T>>();
        // Holds a single sequence of consecutive items in the list, formed only of items that are also contained in the HashSet sourceItems.
        List<T> sequenceOfConsecutiveItems = null;
        /* Holds all the items that start a sequence of consecutive items, so that each sequence is saved only once in the List of Lists
         * (no two sequences containing the same items in the same order).
         * Limitation: only the first occurrence of an item may start a sequence, so when the list contains repeated items, sequences that start at a later
         * occurrence of an item (for example { 1, 3 } in the list { 1, 2, 1, 3 }) are not produced.
         */
        HashSet<T> uniqueSequenceStartingItems = new HashSet<T>();
        T currentItem;
        for (int i = 0; i < list.Count; i++)
        {
            sequenceOfConsecutiveItems = new List<T>();
            currentItem = list[i];
            T nextItem;
            if (sourceItems.Contains(currentItem))
            {
                // Add succeeds only on the first occurrence of this specific item in the list.
                if (uniqueSequenceStartingItems.Add(currentItem))
                {
                    // Record the single-item sequence formed of currentItem alone.
                    sequenceOfConsecutiveItems.Add(currentItem);
                    allSequencesOFConsecutiveItemsInList.Add(sequenceOfConsecutiveItems);
                    /* Continue with a copy that already contains currentItem. The copy is added to allSequencesOFConsecutiveItemsInList only if at least one
                     * more consecutive item that is also contained in the HashSet sourceItems is found.
                     */
                    sequenceOfConsecutiveItems = new List<T>(sequenceOfConsecutiveItems);
                    bool itemNotContainedInSourceItemsEncountered = false;
                    int j = i;
                    // Extend the sequence with the items that follow currentItem, for as long as they are also contained in the HashSet sourceItems.
                    while (j < list.Count - 1 && !itemNotContainedInSourceItemsEncountered)
                    {
                        nextItem = list[j + 1];
                        if (sourceItems.Contains(nextItem))
                        {
                            // Another consecutive common item was found: record the extended sequence, then continue with a copy of it.
                            sequenceOfConsecutiveItems.Add(nextItem);
                            allSequencesOFConsecutiveItemsInList.Add(sequenceOfConsecutiveItems);
                            sequenceOfConsecutiveItems = new List<T>(sequenceOfConsecutiveItems);
                        }
                        else
                        {
                            // nextItem is not contained in the HashSet sourceItems, so the run of consecutive common items ends here.
                            itemNotContainedInSourceItemsEncountered = true;
                        }
                        j++;
                    }
                }
            }
        }

        return allSequencesOFConsecutiveItemsInList;
    }

    public static List<List<T>> FindCommonSequencesInTwoListsOfLists<T>(List<List<T>> sequencesOfItemsInList1, List<List<T>> sequencesOfItemsInList2)
    {
        List<List<T>> commonSequencesOfConsecutiveItems = new List<List<T>>();
        for (int i = 0; i < sequencesOfItemsInList1.Count; i++)
        {
            List<T> currentSequenceInList1 = sequencesOfItemsInList1[i];
            for (int j = 0; j < sequencesOfItemsInList2.Count; j++)
            {
                List<T> currentSequenceInList2 = sequencesOfItemsInList2[j];
                if (currentSequenceInList1.SequenceEqual(currentSequenceInList2))
                {
                    commonSequencesOfConsecutiveItems.Add(currentSequenceInList1);
                }
            }
        }

        return commonSequencesOfConsecutiveItems;
    }
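
Since the answer asks for feedback on efficiency, here is a rough sketch of a different technique for comparison: the longest-common-suffix table that is commonly used for the longest-common-substring problem. It is not a rewrite of the code above. The names `CommonRuns`, `FindCommonRuns` and `RunComparer` are hypothetical, items are compared with `EqualityComparer<T>.Default`, and the sketch assumes that a run counts wherever it occurs in either list (including at a repeated item) and that each distinct run should be reported once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonRuns
{
    // Hypothetical comparer that treats two runs as equal when they hold the same items in the same order.
    private sealed class RunComparer<T> : IEqualityComparer<List<T>>
    {
        public bool Equals(List<T> x, List<T> y) => x.SequenceEqual(y);

        public int GetHashCode(List<T> run)
        {
            int hash = 17;
            foreach (T item in run)
            {
                int itemHash = item == null ? 0 : EqualityComparer<T>.Default.GetHashCode(item);
                hash = unchecked(hash * 31 + itemHash);
            }
            return hash;
        }
    }

    // Returns every distinct run of consecutive items that occurs in both lists.
    public static List<List<T>> FindCommonRuns<T>(IReadOnlyList<T> a, IReadOnlyList<T> b)
    {
        var itemComparer = EqualityComparer<T>.Default;
        // suffix[i, j] = length of the longest common run that ends at a[i - 1] and at b[j - 1].
        var suffix = new int[a.Count + 1, b.Count + 1];
        var distinctRuns = new HashSet<List<T>>(new RunComparer<T>());
        var result = new List<List<T>>();

        for (int i = 1; i <= a.Count; i++)
        {
            int longest = 0; // Longest common run that ends at a[i - 1].
            for (int j = 1; j <= b.Count; j++)
            {
                if (itemComparer.Equals(a[i - 1], b[j - 1]))
                {
                    suffix[i, j] = suffix[i - 1, j - 1] + 1;
                    longest = Math.Max(longest, suffix[i, j]);
                }
            }

            // Every shorter run that ends at a[i - 1] is also common, so emit lengths 1..longest.
            for (int length = 1; length <= longest; length++)
            {
                var run = new List<T>(length);
                for (int k = i - length; k < i; k++)
                {
                    run.Add(a[k]);
                }
                if (distinctRuns.Add(run))
                {
                    result.Add(run);
                }
            }
        }

        return result;
    }

    public static void Main()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        var list2 = new List<int> { 9, 2, 3, 7, 4 };

        // Prints one run per line: "2", "3", "2, 3", "4".
        foreach (List<int> run in FindCommonRuns(list1, list2))
        {
            Console.WriteLine(string.Join(", ", run));
        }
    }
}
```

For lists of n and m items the table takes O(n·m) time and memory, on top of the cost of building the output, which can itself be large because every sub-run of a long common run is reported.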