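Combining discrete and/or overlapping time sequences from lists

Time: 2016-05-28 08:11:41

Tags: python python-2.7 time set-intersection

I have 2 ordered lists of times, which are the start/stop times of ServiceA and ServiceB. I want to combine them into a single list containing the start and stop times (in order) of the periods during which at least one service was running. (Both services always start after 00:01:00 and stop before 23:59:00.)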

Example:

ListA = ["08:03:19","14:22:22","17:00:02","18:30:01"]
ListB = ["15:19:03","18:00:00","18:35:05","19:01:00"]
... the magic happens
Result =["08:03:19","14:22:22","15:19:03","18:30:01","18:35:05","19:01:00"]

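The code below does not produce the desired result. After countless attempts that handled most, but not all, of the possible cases, I am posting what I currently have. The lists can differ greatly; for example, the start/stop times of the 2 services may not overlap at all, or may overlap completely.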

#!/usr/bin/python2.7

def combineOverlappingTimes(aList, bList, CurrentResults):
    ReturnList = []
    aStart = aList[0]
    aStop = aList[1]
    bStart = bList[0]
    bStop = bList[1]

    if len(CurrentResults) == 0:
        LastTimeInCurrentResults = "00:00:00"
    else:
        LastTimeInCurrentResults = CurrentResults[(len(CurrentResults)-1)]

    print "aStart= %s\naStop= %s\nbStart= %s\nbStop= %s" % (aStart,aStop,bStart,bStop)
    print "LastTimeInCurrentResults= %s" % LastTimeInCurrentResults
    if aStart >= LastTimeInCurrentResults and bStart >= LastTimeInCurrentResults:
        if aStart > bStart:
            if bStart > aStop:
                ReturnList.append( (aStart,aStop) )
            elif bStart < aStop:
                ReturnList.append( (bStart,bStop ) )
        else: #(aStart < bStart)
            if aStop < bStart:
                ReturnList.append( (bStart,bStop) )
            elif aStop > bStop: 
                ReturnList.append( (bStart,aStop) )
    elif aStart >= LastTimeInCurrentResults:
        ReturnList.append( (aStart, aStop) )
    else: # either A or B is beforeLastTime
        if aStart < LastTimeInCurrentResults:
            ReturnList.append( (LastTimeInCurrentResults, aStop) )
        elif bStart < LastTimeInCurrentResults:
            ReturnList.append( (LastTimeInCurrentResults, bStop) )

    print ( "combineOverlappingTime ReturnList= " + str(ReturnList))
    print "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\n\n"
    return ReturnList

# main()
#####################################################################
def main():

    ListA = ["08:03:19","14:22:22","14:22:25","14:22:30","18:00:02","18:30:01"]
    ListB = ["14:22:36","15:18:10","15:19:03","18:00:01","18:00:05","19:01:00"]
    ResultList = []

    i = 0
    while i < len(ListA):
        if i == 0:
            ListA_StartTime= ListA[i]
            ListA_StopTime = ListA[i+1]
        else:
            if i == len(ListA)-2:
                ListA_StartTime= ListA[i]
                ListA_StopTime = ListA[i+1]
            else:
                ListA_StartTime= ListA[i]
                ListA_StopTime = ListA[i+1]

        j = 0
        ListB_StartTime, ListB_StopTime = "",""
        for time in ListB:
            if j % 2 == 0:
                ListB_StartTime= time
            else:
                ListB_StopTime = time

            if ListB_StartTime!= "" and ListB_StopTime != "":
                tempSetA, tempSetB = [], []
                tempSetA.append(ListB_StartTime)
                tempSetA.append(ListB_StopTime)
                tempSetB.append(ListA_StartTime)
                tempSetB.append(ListA_StopTime)
                combinedTimes = combineOverlappingTimes(tempSetA, tempSetB, ResultList)
                for start,stop in combinedTimes:
                    ResultList.append(start)
                    ResultList.append(stop)
                ListB_StartTime, ListB_StopTime = "",""
            j += 1

        i += 2

    print "ResultList= %s \n\n" % str(ResultList)
    DesiredList = ["08:03:19","14:22:22","14:22:25","14:22:30","14:22:36","15:18:10","15:19:03","18:00:01","18:00:02","19:01:00"]
    print "Desired Results: %s" % str(DesiredList)


if __name__ == '__main__':
    main()

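2 Answers:

Answer 0 (score: 1)

By letting functools.reduce from the standard library do the heavy lifting, you can do this without a single for loop. Some consider this more idiomatic (although Guido, of course, dislikes the reduce function, which is why it was moved out of the built-ins and into functools in Python 3).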

from functools import reduce

# this would work with any comparable values in `aList` and `bList`
aList = [0, 3, 7, 10, 13, 14]
bList = [2, 4, 10, 11, 13, 15]

# split both lists into tuples of the form `(start, stop)`
aIntervals = list(zip(aList[::2], aList[1::2]))
bIntervals = list(zip(bList[::2], bList[1::2]))

# sort the joint list of intervals by start time
intervals = sorted(aIntervals + bIntervals)

# reduction function, `acc` is the current result, `v` is the next interval
def join(acc, v):
    # if an empty list, return the new interval
    if not acc:
        return [v]
    # pop the last interval from the list
    last = acc.pop()
    # if the intervals are disjoint, return both
    if v[0] > last[1]:
        return acc + [last, v]
    # otherwise, join them together
    return acc + [(last[0], max(last[1], v[1]))]

# this is a list of joined intervals...
joined_intervals = reduce(join, intervals, [])

# ... which we can join back into a single list of start/stop times
print(list(sum(joined_intervals, ())))

As expected, this outputs

[0, 4, 7, 11, 13, 15]

Testing with the time values from the example provided:

aList = ['08:03:19', '14:22:22', '17:00:02', '18:30:01']
bList = ['15:19:03', '18:00:00', '18:35:05', '19:01:00']

also produces the desired answer

['08:03:19', '14:22:22', '15:19:03', '18:30:01', '18:35:05', '19:01:00']
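
The time strings work here because zero-padded HH:MM:SS values compare lexicographically in the same order as the times they represent. The following is a minimal sketch of that assumption, not part of the original answer; the helper name `to_time` is hypothetical, and parsing with `datetime.strptime` is only one possible way to validate the input before merging.

```python
from datetime import datetime

def to_time(text):
    """Parse an 'HH:MM:SS' string into a datetime.time (raises ValueError if malformed)."""
    return datetime.strptime(text, "%H:%M:%S").time()

times = ['08:03:19', '14:22:22', '15:19:03', '18:30:01', '18:35:05', '19:01:00']

# For zero-padded strings, string order and chronological order agree
strings_sorted = sorted(times) == times
parsed_sorted = sorted(times, key=to_time) == times

# Without zero padding the string comparison goes wrong: '9' sorts after '1'
unpadded_ok = '9:00:00' < '10:00:00'

print(strings_sorted, parsed_sorted, unpadded_ok)
```

If the input might contain unpadded or malformed values, converting with a helper like this before sorting is the safer choice.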

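Answer 1 (score: 0)

Breaking down the problem

Your code can be more readable if you break the problem into functions and plan before coding.

Your problem can be broken down into smaller, manageable parts:

  1. Create a function that returns the total uptime of two time intervals.
  2. Sort the times in aList and bList in ascending order.
  3. Compare the first two times of aList and bList to get the total uptime of the first two intervals.
  4. a. If it returns 4 times, the given intervals are disjoint. Take the last two times and compare them with the next two times in aList.
     b. If not, the intervals are connected. Take the two times and compare them with the next two times in aList.
  5. Using the result of 4, compare the times in bList in a similar way to 4.
  6. Repeat until the end of the lists.

After reading your code, the problem seems to be that it was not clear what needed to be done (step by step).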

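Solving the problem

1. Create a function that returns the total uptime of two time intervals.

The two intervals can be arranged in the following possible ways: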

      NOT CONNECTED
      Case 1
      A: ---
      B:      -----
      Case 2
      A:      -----
      B: ---
      
      CONNECTION
      Connected A starts first
      Case 1
      A: -------
      B:   ---------
      Case 2
      A: -------
      B:   ---
      
      Connected B starts first
      Case 3
      A:   ---------
      B: -------
      Case 4
      A:   ---
      B: -------
      
      EQUALITY
      Starting Equality
      Case 1
      A: ---
      B: -----
      Case 2
      A: -----
      B: ---
      
      Ending Equality
      Case 3
      A: -----
      B:   ---
      Case 4
      A:   ---
      B: -----
      
      Total Equality
      Case 5
      A: -----
      B: -----
      

With this in mind, you can go ahead and write the code for it.

      def combined_uptime(a, b):
          # Only accept lists of length two
          aStart = a[0]
          aStop = a[1]
          bStart = b[0]
          bStop = b[1]
      
          # < means "earlier than"
          # > means "later than"
      
          # Unconnected
          # Not connected Case 1
          if aStop < bStart:
              return (aStart, aStop, bStart, bStop)
          # Not connected Case 2
          elif bStop < aStart:
              return (bStart, bStop, aStart, aStop)
      
          # From this point on, A and B are connected
      
          # CONNECTION
          # A starts first
          if aStart <= bStart:
              if aStop < bStop:
                  # Connection Case 1 + Equality Case 1
                  return (aStart, bStop)
              else:
                  # Connection Case 2 + Equality Case 2 + Equality Case 3 + Equality Case 5
                  return (aStart, aStop)
          else:
              if bStop < aStop:
                  # Connection Case 3
                  return (bStart, aStop)
              else:
                  # Connection Case 4 + Equality Case 4
                  return (bStart, bStop)
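
To see how the cases map to return values, here is a small usage sketch. It assumes a condensed restatement of `combined_uptime` (earliest start and latest stop for connected intervals), included only so that the block runs on its own; the `cases` list is made up for illustration.

```python
def combined_uptime(a, b):
    # Condensed restatement of the function above, so this block is self-contained
    (a_start, a_stop), (b_start, b_stop) = a, b
    if a_stop < b_start:      # A lies entirely before B
        return (a_start, a_stop, b_start, b_stop)
    if b_stop < a_start:      # B lies entirely before A
        return (b_start, b_stop, a_start, a_stop)
    # Connected: earliest start, latest stop
    return (min(a_start, b_start), max(a_stop, b_stop))

cases = [
    (("08:00:00", "09:00:00"), ("10:00:00", "11:00:00")),  # not connected, A first
    (("10:00:00", "11:00:00"), ("08:00:00", "09:00:00")),  # not connected, B first
    (("08:00:00", "10:00:00"), ("09:00:00", "11:00:00")),  # connected, A starts first
    (("09:00:00", "10:00:00"), ("08:00:00", "11:00:00")),  # connected, A inside B
    (("08:00:00", "09:00:00"), ("08:00:00", "09:00:00")),  # total equality
    (("08:00:00", "09:00:00"), ("09:00:00", "10:00:00")),  # A stops exactly when B starts
]

results = [combined_uptime(a, b) for a, b in cases]
for (a, b), result in zip(cases, results):
    # A 4-tuple means the intervals are disjoint, a 2-tuple means they were joined
    print(a, b, "->", result)
```

Note that in the last case the two intervals only touch, and the function still treats them as connected because the disjoint tests use a strict `<`.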
      

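2. Sort the times in aList and bList in ascending order.

This can be done by storing each start/stop pair in a tuple, sorting the tuples, and then flattening them back into a list.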

      def sort_ascending(x):
          # Pair each start time with its stop time
          pairs = list(zip(x[::2], x[1::2]))

          # Sort the pairs by start time
          pairs.sort(key=lambda tup: tup[0])

          # Flatten the pairs back into a single list
          ret_list = []
          for pair in pairs:
              ret_list.extend(pair)
          return ret_list
      

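3-6. Recursive function

You can see that the last step consists of repeating the same process. This usually hints that a recursive function can do the job.

Using all the functions mentioned above, here is the recursive function: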

      def uptime(a, b, result=None):
          print a
          print b
          print result
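          ret_list = []
          # Treat a missing result as an empty list so the base cases below work
          if result is None:
              result = []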
      
          # Return the result if either a or b is empty
          if a == [] and b == []:
              return result
          elif a and b == []:
              return result.extend(a) or result[:]
          elif b and a == []:
              return result.extend(b) or result[:]
      
          # Prevent overwriting, make a copy
          aList = list(a)[:]
          bList = list(b)[:]
      
          # Get results from previous iteration
          if result:
              # Process aList
              results_aList = list(combined_uptime(aList[0:2], result))
              del aList[0:2]
      
              if len(results_aList) != 2:
                  ret_list.extend(results_aList[0:2])
                  del results_aList[0:2]
      
              # Process bList
              results_bList = list(combined_uptime(bList[0:2], results_aList))
              del bList[0:2]
      
              if len(results_bList) != 2:
                  ret_list.extend(results_bList[0:2])
                  del results_bList[0:2]
      
              print "CLEAR"
              # Add result
              ret_list.extend(uptime(aList, bList, results_bList))
          else:
              # First iteration
              results_aList_bList = list(combined_uptime(aList[0:2], bList[0:2]))
              del aList[0:2]
              del bList[0:2]
      
              if len(results_aList_bList) != 2:
                  # Disjoint
                  ret_list.extend(results_aList_bList[0:2])
                  del results_aList_bList[0:2]
              print "CLEAr"
              ret_list.extend(uptime(aList, bList, results_aList_bList))
          return ret_list
      

For the test case you provided in the comments, it returns

      ["00:00:01","22:00:00", "22:00:00","23:58:59"]
      

when passed

      ListA = ["00:00:01","22:00:00"]
      ListB = ["08:00:00", "09:00:00", "22:00:00","23:58:59"]
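
For comparison, here is a minimal sort-and-sweep sketch on the same input (it is not the recursive function above). The function name `merge_intervals` and the `merge_touching` flag are hypothetical, introduced only to show how the choice of comparison decides whether an interval that starts exactly when the previous one stops is kept separate, as in the output above, or joined.

```python
def merge_intervals(a_times, b_times, merge_touching=False):
    """Merge two flat [start, stop, start, stop, ...] lists into one flat list."""
    # Pair up start/stop values and sort all intervals by start time
    intervals = sorted(
        list(zip(a_times[::2], a_times[1::2])) +
        list(zip(b_times[::2], b_times[1::2]))
    )
    merged = []
    for start, stop in intervals:
        if merged:
            last_start, last_stop = merged[-1]
            overlaps = start < last_stop or (merge_touching and start == last_stop)
            if overlaps:
                # Extend the previous interval instead of adding a new one
                merged[-1] = (last_start, max(last_stop, stop))
                continue
        merged.append((start, stop))
    # Flatten back into a single list of times
    return [t for interval in merged for t in interval]

ListA = ["00:00:01", "22:00:00"]
ListB = ["08:00:00", "09:00:00", "22:00:00", "23:58:59"]

separate = merge_intervals(ListA, ListB)
joined = merge_intervals(ListA, ListB, merge_touching=True)
print(separate)
print(joined)
```

Which behavior is wanted depends on whether a stop and a start at the same second should count as continuous uptime; the question does not say, so the flag is left to the caller.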