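Calculating elapsed time from a list of dates

Asked: 2018-08-06 20:54:45

Tags: python python-3.x

I'm looking for the simplest way to calculate elapsed time. Below is a sample of my list of lists. For each day I need to compute end time minus start time, e.g. for 2018-07-01: 17:00 - 08:00 = 09:00. I have tried many loops, and also iterating with itertools.combinations, but it always fails.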

[['2018-07-01', '8:00', 'IN'], 
['2018-07-01', '12:00', 'OUT'], 
['2018-07-01', '12:30', 'IN'],
['2018-07-01', '17:00', 'OUT'], 
['2018-07-02', '8:00', 'IN'], 
['2018-07-02', '12:00', 'OUT'], 
['2018-07-02', '12:30', 'IN'], 
['2018-07-02', '17:00', 'OUT'], 
['2018-07-03', '8:00', 'IN'], 
['2018-07-03', '12:00', 'OUT'], 
['2018-07-03', '12:30', 'IN'],
['2018-07-03', '17:00', 'OUT'],
['2018-07-04', '8:00', 'IN'], 
['2018-07-04', '17:00', 'OUT']]

My attempts:

import itertools

for idx, element in enumerate(test):
    try:
        if element[0] == test[idx + 1][0]:
            print(element)
    except IndexError:
        pass

index = 0
for a, b in itertools.combinations(test, 2):
    if a[0] and b[0] and a[2] == 'IN' and b[2] == 'OUT':
        print(a, b)
        index += 1
print(index)

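5 answers:

Answer 0 (score: 1)

Here is a solution for Python 3 using itertools.groupby. It relies on the list already being sorted by date and time, because groupby only groups adjacent items.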

>>> lst = [['2018-07-01', '8:00', 'IN'], ['2018-07-01', '12:00', 'OUT'], ['2018-07-01', '12:30', 'IN'], ['2018-07-01', '17:00', 'OUT'], ['2018-07-02', '8:00', 'IN'], ['2018-07-02', '12:00', 'OUT'], ['2018-07-02', '12:30', 'IN'], ['2018-07-02', '17:00', 'OUT'], ['2018-07-03', '8:00', 'IN'], ['2018-07-03', '12:00', 'OUT'], ['2018-07-03', '12:30', 'IN'], ['2018-07-03', '17:00', 'OUT'], ['2018-07-04', '8:00', 'IN'], ['2018-07-04', '17:00', 'OUT']]
>>> 
>>> from datetime import datetime
>>> from itertools import groupby
>>> from pprint import pprint
>>> to_time = lambda s: datetime.strptime(s, '%H:%M')
>>> diff_time = lambda s1, s2: str(to_time(s1)-to_time(s2))
>>> 
>>> res = {date:diff_time(last[1], first[1]) for date,(first,*_,last) in groupby(lst, lambda x: x[0])}
>>> pprint(res)
{'2018-07-01': '9:00:00',
 '2018-07-02': '9:00:00',
 '2018-07-03': '9:00:00',
 '2018-07-04': '9:00:00'}

For Python 2, replace the res = line with these two lines:

>>> res = {date:list(times) for date,times in groupby(lst, lambda x: x[0])}
>>> res = {date:diff_time(times[-1][1], times[0][1]) for date,times in res.items()}
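The groupby approach above depends on the input order. As a minimal sketch, assuming the dates are zero-padded ISO strings (so they sort correctly as text), the records can be sorted first and the earliest and latest times taken with min and max. The function name daily_span and the shuffled sample data are made up for illustration:

```python
from datetime import datetime
from itertools import groupby

def daily_span(records):
    """Return {date: latest time - earliest time} as timedelta objects."""
    # groupby only merges adjacent items, so sort by date first.
    ordered = sorted(records, key=lambda r: r[0])
    spans = {}
    for date, group in groupby(ordered, key=lambda r: r[0]):
        times = [datetime.strptime(r[1], '%H:%M') for r in group]
        # min/max also work when a day has a single record (span of 0:00:00).
        spans[date] = max(times) - min(times)
    return spans

# Hypothetical, deliberately shuffled sample data
sample = [['2018-07-02', '17:00', 'OUT'],
          ['2018-07-01', '8:00', 'IN'],
          ['2018-07-02', '8:30', 'IN'],
          ['2018-07-01', '17:00', 'OUT']]

for date, span in daily_span(sample).items():
    print(date, span)
```

Because the times are compared as parsed datetime objects rather than strings, '8:00' correctly sorts before '12:00'.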

Answer 1 (score: 1)

dates = [['2018-07-01', '8:00', 'IN'], 
['2018-07-01', '12:00', 'OUT'], 
['2018-07-01', '12:30', 'IN'],
['2018-07-01', '17:00', 'OUT'], 
['2018-07-02', '8:00', 'IN'], 
['2018-07-02', '12:00', 'OUT'], 
['2018-07-02', '12:30', 'IN'], 
['2018-07-02', '17:00', 'OUT'], 
['2018-07-03', '8:00', 'IN'], 
['2018-07-03', '12:00', 'OUT'], 
['2018-07-03', '12:30', 'IN'],
['2018-07-03', '17:00', 'OUT'],
['2018-07-04', '8:00', 'IN'], 
['2018-07-04', '17:00', 'OUT']]

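totalTime = dict()

# Subtract every IN time and add every OUT time, so breaks are excluded:
# days with a lunch break give 8.5 hours rather than 9.
for item in dates:
  date    = item[0]
  hr, mins = item[1].split(':')
  minutes = float(hr) * 60 + float(mins)  # minutes since midnight
  inout   = item[2]

  if date not in totalTime:
    totalTime[date] = 0

  if inout == 'IN':
    totalTime[date] -= minutes
  else:
    totalTime[date] += minutes

# dict.items() is the Python 3 replacement for iteritems()
for date, minutes in totalTime.items():
  print(date, minutes/60)

Output:

2018-07-01 8.5
2018-07-02 8.5
2018-07-03 8.5
2018-07-04 9.0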
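One caveat with the signed sum above: if an IN record has no matching OUT (or the other way round), it still produces a number, just a wrong one. A possible variation, sketched here under the assumption that the records arrive in chronological order, pairs each IN with the next OUT on the same date and raises an error otherwise. The name worked_time and the shortened sample are illustrative only:

```python
from datetime import datetime, timedelta

def worked_time(records):
    """Sum the OUT - IN intervals per date; records must be in chronological order."""
    totals = {}
    open_in = {}  # date -> time of the IN that is still waiting for its OUT
    for date, clock, kind in records:
        t = datetime.strptime(clock, '%H:%M')
        if kind == 'IN':
            if date in open_in:
                raise ValueError(f'two IN records in a row on {date}')
            open_in[date] = t
        else:
            if date not in open_in:
                raise ValueError(f'OUT without IN on {date}')
            totals[date] = totals.get(date, timedelta()) + (t - open_in.pop(date))
    if open_in:
        raise ValueError(f'IN without OUT on {sorted(open_in)}')
    return totals

# Shortened sample taken from the question's data
sample = [['2018-07-01', '8:00', 'IN'], ['2018-07-01', '12:00', 'OUT'],
          ['2018-07-01', '12:30', 'IN'], ['2018-07-01', '17:00', 'OUT'],
          ['2018-07-04', '8:00', 'IN'], ['2018-07-04', '17:00', 'OUT']]

for date, total in worked_time(sample).items():
    print(date, total)
```

Returning timedelta objects instead of float hours keeps the result exact and lets the caller decide how to format it.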

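Answer 2 (score: 0)

It looks like, for each day, the start time always comes first and the end time always comes last. Here is what you could do (it has been a while since I programmed in Python, so treat this as the general idea):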

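from datetime import datetime

# lst is the list of [date, time, 'IN'/'OUT'] records from the question,
# already sorted by date and time
i = 0
while i < len(lst):
  day = lst[i][0]
  start = datetime.strptime(lst[i][1], '%H:%M')  # first record of the day
  # advance i to the last record that shares this date
  while i + 1 < len(lst) and lst[i + 1][0] == day:
    i = i + 1
  end = datetime.strptime(lst[i][1], '%H:%M')  # last record of the day
  print(day, end - start)

  i = i + 1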

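I haven't thought this through thoroughly, but I think it should work; otherwise someone will correct me :)

Answer 3 (score: 0)

I assume you want the difference between the latest and the earliest time of each day? If so, I think this pandas solution should work: group by day, then subtract the earliest time from the latest one. (Note that in your data the start and end times are always 8:00 and 17:00; it would be better to test this with data where the answer actually varies.)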

import pandas as pd
df = pd.DataFrame(
    [['2018-07-01', '8:00', 'IN'], 
     ['2018-07-01', '12:00', 'OUT'], 
     ['2018-07-01', '12:30', 'IN'],
     ['2018-07-01', '17:00', 'OUT'], 
     ['2018-07-02', '8:00', 'IN'], 
     ['2018-07-02', '12:00', 'OUT'], 
     ['2018-07-02', '12:30', 'IN'], 
     ['2018-07-02', '17:00', 'OUT'], 
     ['2018-07-03', '8:00', 'IN'], 
     ['2018-07-03', '12:00', 'OUT'], 
     ['2018-07-03', '12:30', 'IN'],
     ['2018-07-03', '17:00', 'OUT'],
     ['2018-07-04', '8:00', 'IN'], 
     ['2018-07-04', '17:00', 'OUT']],
    columns=['date', 'hour', 'in_out']
)
df = df.drop(columns=['in_out'])  # the IN/OUT flag is not needed here
df['hour'] = pd.to_datetime(df['hour'], format='%H:%M')  # parse 'H:MM' strings

grouped_hours = df.groupby('date')['hour']
start_time = grouped_hours.min()  # earliest time per day
end_time = grouped_hours.max()    # latest time per day

print(end_time - start_time)

Answer 4 (score: 0)

This can be done with plain Python code:

from datetime import datetime
l=[['2018-07-01', '8:00', 'IN'], 
   ['2018-07-01', '12:00', 'OUT'], 
   ['2018-07-01', '12:30', 'IN'],
   ['2018-07-01', '17:00', 'OUT'], 
   ['2018-07-02', '8:00', 'IN'], 
   ['2018-07-02', '12:00', 'OUT'], 
   ['2018-07-02', '12:30', 'IN'], 
   ['2018-07-02', '17:00', 'OUT'], 
   ['2018-07-03', '8:00', 'IN'], 
   ['2018-07-03', '12:00', 'OUT'], 
   ['2018-07-03', '12:30', 'IN'],
   ['2018-07-03', '17:00', 'OUT'],
   ['2018-07-04', '8:00', 'IN'], 
   ['2018-07-04', '17:00', 'OUT']]


def sortt(key1, key2):
    # Build a datetime from the date and time strings so records sort chronologically
    dt = key1.split('-')
    tt = key2.split(':')
    return datetime(int(dt[0]), int(dt[1]), int(dt[2]), int(tt[0]), int(tt[1]))


sortedlist = sorted(l, key=lambda x: sortt(x[0], x[1]))

currentDate = sortedlist[0][0]
currentTime = sortedlist[0][1]
for i in range(1, len(sortedlist)):
    if currentDate != sortedlist[i][0]:
        # A new day starts at index i, so index i-1 is the previous day's last record
        print(currentDate + ' ' + currentTime + '-' + sortedlist[i-1][1])
        currentDate = sortedlist[i][0]
        currentTime = sortedlist[i][1]
# Print the final day, whose last record is the last element of the list
print(currentDate + ' ' + currentTime + '-' + sortedlist[-1][1])

Output:

2018-07-01 8:00-17:00
2018-07-02 8:00-17:00
2018-07-03 8:00-17:00
2018-07-04 8:00-17:00
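The question asks for a result in the form 09:00, which none of the answers prints directly. Assuming the difference has already been computed as a datetime.timedelta (for example by subtracting two datetime.strptime results), a small helper along these lines could do the formatting. The name format_hhmm is made up for this sketch, and it only handles non-negative differences:

```python
from datetime import timedelta

def format_hhmm(delta):
    """Format a non-negative timedelta as zero-padded HH:MM."""
    total_minutes = int(delta.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f'{hours:02d}:{minutes:02d}'

print(format_hhmm(timedelta(hours=9)))               # 09:00
print(format_hhmm(timedelta(hours=8, minutes=30)))   # 08:30
```

Working from total_seconds() rather than the timedelta's seconds attribute means differences of 24 hours or more are still reported as whole hours.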