Python: max and min value per hour

Time: 2015-04-26 14:42:08

Tags: python arrays datetime max min

I have a 2x720 array. The first column is a datetime and the second column is a value. My data looks like this:

[(datetime.datetime(2015,4,26,0,10),25.2),
(datetime.datetime(2015,4,26,0,20),25.1),
(datetime.datetime(2015,4,26,0,30),25.7),
(datetime.datetime(2015,4,26,0,40),23.2),
(datetime.datetime(2015,4,26,0,50),22.2),
(datetime.datetime(2015,4,26,0,59),29.2),
(datetime.datetime(2015,4,26,1,00),22.2),
(datetime.datetime(2015,4,26,1,10),21.2), ...]

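All the data is from the same day. I just want to organize the data by hour to prepare a candlestick plot (only max and min; I don't need open and close). I want data like this: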

[(datetime.datetime(2015,4,26,0,00),max in hour 0, min in hour 0),
(datetime.datetime(2015,4,26,1,00),max in hour 1, min in hour 1),    
(datetime.datetime(2015,4,26,2,00),max in hour 2, min in hour 2), ...
(datetime.datetime(2015,4,26,23,00),max in hour 23, min in hour 23)]

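I'm new to Python and would like a nice, short script. I used C++ before (a long, long time ago), and I find Python to be more of an art than just programming. I searched for an answer for a while but couldn't find one that fits my needs. Thanks for your help.

4 answers:

Answer 0: (score: 0)

Since they are all on the same day, group the values by hour, then summarize each group.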

 import datetime
 from collections import defaultdict

 start_of_day = datetime.datetime(2015, 4, 26)

 # Collect the values that fall into each hour of the day.
 hour_to_values = defaultdict(list)
 for dt, value in your_list_of_values:
     hour_to_values[dt.hour].append(value)

 # One (hour start, max, min) tuple per hour, in chronological order.
 result = [(start_of_day + datetime.timedelta(hours=hour),
            max(values), min(values))
           for hour, values in sorted(hour_to_values.items())]
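If the data ever spans more than one day, keying on `dt.hour` alone would merge the same hour from different days. A minimal sketch of one way around that, assuming the same `(datetime, value)` rows as in the question: key each bucket on the timestamp truncated to the top of the hour. The function name `hourly_max_min` and the second-day sample row are made up for illustration.

```python
import datetime
from collections import defaultdict

# Sample rows in the question's (datetime, value) shape; the row on the 27th
# is invented purely to show that hours from different days stay separate.
rows = [
    (datetime.datetime(2015, 4, 26, 0, 10), 25.2),
    (datetime.datetime(2015, 4, 26, 0, 40), 23.2),
    (datetime.datetime(2015, 4, 26, 1, 0), 22.2),
    (datetime.datetime(2015, 4, 26, 1, 10), 21.2),
    (datetime.datetime(2015, 4, 27, 0, 30), 30.0),
]


def hourly_max_min(rows):
    """Return [(hour_start, max, min), ...] sorted by hour_start."""
    buckets = defaultdict(list)
    for dt, value in rows:
        # Truncate to the top of the hour so the key carries the date too.
        hour_start = dt.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return [(hour_start, max(values), min(values))
            for hour_start, values in sorted(buckets.items())]


result = hourly_max_min(rows)
for row in result:
    print(row)
```

Because the key is a full datetime, no separate `start_of_day` is needed to rebuild the timestamps.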

Answer 1: (score: 0)

The following assumes the list is already sorted by date.

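# Each output row is [datetime of the first point in the hour, min, max].
output = []
current_hour = None
current_output = None
for point in data:
    phour = point[0].hour
    pvalue = point[1]
    if phour == current_hour:  # compare by value, not identity
        # Same hour as the previous point: widen the running min/max.
        if pvalue < current_output[1]:
            current_output[1] = pvalue
        if pvalue > current_output[2]:
            current_output[2] = pvalue
    else:
        # A new hour starts: open a new row seeded with this value.
        current_hour = phour
        output.append([point[0], pvalue, pvalue])
        current_output = output[-1]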
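If the sort order cannot be guaranteed, one possible variation keeps the running `[min, max]` pair in a dict keyed by the hour, so points may arrive in any order. This is only a sketch; the name `hourly_min_max_unsorted` and the shuffled sample points are illustrative assumptions, not part of the answer above.

```python
import datetime


def hourly_min_max_unsorted(points):
    """Return [[hour_start, min, max], ...] for points given in any order."""
    extremes = {}  # hour_start -> [running min, running max]
    for dt, value in points:
        hour_start = dt.replace(minute=0, second=0, microsecond=0)
        pair = extremes.get(hour_start)
        if pair is None:
            # First value seen for this hour seeds both extremes.
            extremes[hour_start] = [value, value]
        else:
            if value < pair[0]:
                pair[0] = value
            if value > pair[1]:
                pair[1] = value
    return [[hour_start, lo, hi]
            for hour_start, (lo, hi) in sorted(extremes.items())]


# Deliberately out-of-order sample points (values taken from the question).
points = [
    (datetime.datetime(2015, 4, 26, 1, 10), 21.2),
    (datetime.datetime(2015, 4, 26, 0, 10), 25.2),
    (datetime.datetime(2015, 4, 26, 1, 0), 22.2),
    (datetime.datetime(2015, 4, 26, 0, 50), 22.2),
]
summary = hourly_min_max_unsorted(points)
```

Like the loop above, this stores only two numbers per hour rather than every value.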

Answer 2: (score: 0)

If your data looks like this:

>>> arr
[[datetime.datetime(2015, 4, 26, 0, 0), 0.9627101684867109], [datetime.datetime(2015, 4, 26, 0, 20), 0.8894632247614254], [datetime.datetime(2015, 4, 26, 0, 40), 0.1920554638586589], [datetime.datetime(2015, 4, 26, 1, 0), 0.24394390686092926], [datetime.datetime(2015, 4, 26, 1, 20), 0.9870880292994234], [datetime.datetime(2015, 4, 26, 1, 40), 0.8154734773666351], [datetime.datetime(2015, 4, 26, 2, 0), 0.5074101780070644], [datetime.datetime(2015, 4, 26, 2, 20), 0.6211085118418351], [datetime.datetime(2015, 4, 26, 2, 40), 0.1309246438480619], [datetime.datetime(2015, 4, 26, 3, 0), 0.2042948575387714], [datetime.datetime(2015, 4, 26, 3, 20), 0.90969148583095], [datetime.datetime(2015, 4, 26, 3, 40), 0.9260473796075621], [datetime.datetime(2015, 4, 26, 4, 0), 0.08180604335801178], [datetime.datetime(2015, 4, 26, 4, 20), 0.9909948477818202], [datetime.datetime(2015, 4, 26, 4, 40), 0.6306008554115328], [datetime.datetime(2015, 4, 26, 5, 0), 0.7218791510465083], [datetime.datetime(2015, 4, 26, 5, 20), 0.5751211758007434], [datetime.datetime(2015, 4, 26, 5, 40), 0.8643323785674638], [datetime.datetime(2015, 4, 26, 6, 0), 0.44366887986412196], [datetime.datetime(2015, 4, 26, 6, 20), 0.5845914793227223], [datetime.datetime(2015, 4, 26, 6, 40), 0.9816449110831348], [datetime.datetime(2015, 4, 26, 7, 0), 0.7976769524401801], [datetime.datetime(2015, 4, 26, 7, 20), 0.019715644725192494], [datetime.datetime(2015, 4, 26, 7, 40), 0.774857573501942], [datetime.datetime(2015, 4, 26, 8, 0), 0.971010849289862], [datetime.datetime(2015, 4, 26, 8, 20), 0.9854650056341737], [datetime.datetime(2015, 4, 26, 8, 40), 0.44764478642480565], [datetime.datetime(2015, 4, 26, 9, 0), 0.41757419665518836], [datetime.datetime(2015, 4, 26, 9, 20), 0.2428205990660569], [datetime.datetime(2015, 4, 26, 9, 40), 0.7652296383460859], [datetime.datetime(2015, 4, 26, 10, 0), 0.6148904798625167], [datetime.datetime(2015, 4, 26, 10, 20), 0.5437523646936837], [datetime.datetime(2015, 4, 26, 10, 40), 
0.7867821039231312], [datetime.datetime(2015, 4, 26, 11, 0), 0.7178834338473005], [datetime.datetime(2015, 4, 26, 11, 20), 0.4349509857268635], [datetime.datetime(2015, 4, 26, 11, 40), 0.2819549901100772], [datetime.datetime(2015, 4, 26, 12, 0), 0.0849398640248602], [datetime.datetime(2015, 4, 26, 12, 20), 0.6260259998494316], [datetime.datetime(2015, 4, 26, 12, 40), 0.8353818765863841], [datetime.datetime(2015, 4, 26, 13, 0), 0.17232607867607763], [datetime.datetime(2015, 4, 26, 13, 20), 0.17091634151665247], [datetime.datetime(2015, 4, 26, 13, 40), 0.7653484731068122], [datetime.datetime(2015, 4, 26, 14, 0), 0.9510280942218504], [datetime.datetime(2015, 4, 26, 14, 20), 0.2696780695726898], [datetime.datetime(2015, 4, 26, 14, 40), 0.6634142333370054], [datetime.datetime(2015, 4, 26, 15, 0), 0.48395825825107863], [datetime.datetime(2015, 4, 26, 15, 20), 0.7669839652095866], [datetime.datetime(2015, 4, 26, 15, 40), 0.9479268674677883], [datetime.datetime(2015, 4, 26, 16, 0), 0.9046641495205922], [datetime.datetime(2015, 4, 26, 16, 20), 0.045289391652820865], [datetime.datetime(2015, 4, 26, 16, 40), 0.7932951067126703], [datetime.datetime(2015, 4, 26, 17, 0), 0.4419846953059643], [datetime.datetime(2015, 4, 26, 17, 20), 0.11146542138230242], [datetime.datetime(2015, 4, 26, 17, 40), 0.5887496294547572], [datetime.datetime(2015, 4, 26, 18, 0), 0.08733136331114111], [datetime.datetime(2015, 4, 26, 18, 20), 0.7957160332912587], [datetime.datetime(2015, 4, 26, 18, 40), 0.8128833057460692], [datetime.datetime(2015, 4, 26, 19, 0), 0.21977323027233342], [datetime.datetime(2015, 4, 26, 19, 20), 0.20504702851137402], [datetime.datetime(2015, 4, 26, 19, 40), 0.6555892081746738], [datetime.datetime(2015, 4, 26, 20, 0), 0.7380315441194354], [datetime.datetime(2015, 4, 26, 20, 20), 0.8075383278433004], [datetime.datetime(2015, 4, 26, 20, 40), 0.837007721004194], [datetime.datetime(2015, 4, 26, 21, 0), 0.8842141478652727], [datetime.datetime(2015, 4, 26, 21, 20), 
0.3349342531521037], [datetime.datetime(2015, 4, 26, 21, 40), 0.811383235093619], [datetime.datetime(2015, 4, 26, 22, 0), 0.8273356582091318], [datetime.datetime(2015, 4, 26, 22, 20), 0.17269590855559502], [datetime.datetime(2015, 4, 26, 22, 40), 0.13561711047456493], [datetime.datetime(2015, 4, 26, 23, 0), 0.8906156794457442], [datetime.datetime(2015, 4, 26, 23, 20), 0.2653437814631542]]

(I replicated it with random data for the second element.) You can put the rows into buckets by hour:

>>> buckets={}
>>> for t in arr: 
...    buckets.setdefault(t[0].hour, []).append(t)

Then sort the keys and get the max and min of each bucket, using the second element of each row as the key:

>>> for hour in sorted(buckets):
...    print(hour, max(buckets[hour], key=lambda l: l[1]), min(buckets[hour], key=lambda l: l[1]))

0 [datetime.datetime(2015, 4, 26, 0, 0), 0.9627101684867109] [datetime.datetime(2015, 4, 26, 0, 40), 0.1920554638586589]
1 [datetime.datetime(2015, 4, 26, 1, 20), 0.9870880292994234] [datetime.datetime(2015, 4, 26, 1, 0), 0.24394390686092926]
2 [datetime.datetime(2015, 4, 26, 2, 20), 0.6211085118418351] [datetime.datetime(2015, 4, 26, 2, 40), 0.1309246438480619]
3 [datetime.datetime(2015, 4, 26, 3, 40), 0.9260473796075621] [datetime.datetime(2015, 4, 26, 3, 0), 0.2042948575387714]
4 [datetime.datetime(2015, 4, 26, 4, 20), 0.9909948477818202] [datetime.datetime(2015, 4, 26, 4, 0), 0.08180604335801178]
5 [datetime.datetime(2015, 4, 26, 5, 40), 0.8643323785674638] [datetime.datetime(2015, 4, 26, 5, 20), 0.5751211758007434]
6 [datetime.datetime(2015, 4, 26, 6, 40), 0.9816449110831348] [datetime.datetime(2015, 4, 26, 6, 0), 0.44366887986412196]
7 [datetime.datetime(2015, 4, 26, 7, 0), 0.7976769524401801] [datetime.datetime(2015, 4, 26, 7, 20), 0.019715644725192494]
8 [datetime.datetime(2015, 4, 26, 8, 20), 0.9854650056341737] [datetime.datetime(2015, 4, 26, 8, 40), 0.44764478642480565]
9 [datetime.datetime(2015, 4, 26, 9, 40), 0.7652296383460859] [datetime.datetime(2015, 4, 26, 9, 20), 0.2428205990660569]
10 [datetime.datetime(2015, 4, 26, 10, 40), 0.7867821039231312] [datetime.datetime(2015, 4, 26, 10, 20), 0.5437523646936837]
11 [datetime.datetime(2015, 4, 26, 11, 0), 0.7178834338473005] [datetime.datetime(2015, 4, 26, 11, 40), 0.2819549901100772]
12 [datetime.datetime(2015, 4, 26, 12, 40), 0.8353818765863841] [datetime.datetime(2015, 4, 26, 12, 0), 0.0849398640248602]
13 [datetime.datetime(2015, 4, 26, 13, 40), 0.7653484731068122] [datetime.datetime(2015, 4, 26, 13, 20), 0.17091634151665247]
14 [datetime.datetime(2015, 4, 26, 14, 0), 0.9510280942218504] [datetime.datetime(2015, 4, 26, 14, 20), 0.2696780695726898]
15 [datetime.datetime(2015, 4, 26, 15, 40), 0.9479268674677883] [datetime.datetime(2015, 4, 26, 15, 0), 0.48395825825107863]
16 [datetime.datetime(2015, 4, 26, 16, 0), 0.9046641495205922] [datetime.datetime(2015, 4, 26, 16, 20), 0.045289391652820865]
17 [datetime.datetime(2015, 4, 26, 17, 40), 0.5887496294547572] [datetime.datetime(2015, 4, 26, 17, 20), 0.11146542138230242]
18 [datetime.datetime(2015, 4, 26, 18, 40), 0.8128833057460692] [datetime.datetime(2015, 4, 26, 18, 0), 0.08733136331114111]
19 [datetime.datetime(2015, 4, 26, 19, 40), 0.6555892081746738] [datetime.datetime(2015, 4, 26, 19, 20), 0.20504702851137402]
20 [datetime.datetime(2015, 4, 26, 20, 40), 0.837007721004194] [datetime.datetime(2015, 4, 26, 20, 0), 0.7380315441194354]
21 [datetime.datetime(2015, 4, 26, 21, 0), 0.8842141478652727] [datetime.datetime(2015, 4, 26, 21, 20), 0.3349342531521037]
22 [datetime.datetime(2015, 4, 26, 22, 0), 0.8273356582091318] [datetime.datetime(2015, 4, 26, 22, 40), 0.13561711047456493]
23 [datetime.datetime(2015, 4, 26, 23, 0), 0.8906156794457442] [datetime.datetime(2015, 4, 26, 23, 20), 0.2653437814631542]

If your data is already ordered by the datetime element, you can skip the separate bucketing step and use groupby:

>>> from itertools import groupby
>>> for hour, group in groupby(arr, lambda t: t[0].hour):
...     li=list(group)
...     print(hour, max(li, key=lambda l: l[1]), min(li, key=lambda l: l[1]))
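To get the `(hour start, max, min)` tuples the question asks for, rather than the full rows, the groupby idea could be adapted roughly as follows. This sketch sorts first in case the order is not guaranteed; the helper name `hour_start` and the shortened sample values are assumptions made for illustration.

```python
import datetime
from itertools import groupby
from operator import itemgetter

# A few rows in the same [datetime, value] shape as arr, with shortened
# illustrative values and deliberately out of order.
arr = [
    [datetime.datetime(2015, 4, 26, 1, 20), 0.98],
    [datetime.datetime(2015, 4, 26, 0, 0), 0.96],
    [datetime.datetime(2015, 4, 26, 0, 40), 0.19],
    [datetime.datetime(2015, 4, 26, 1, 0), 0.24],
]


def hour_start(row):
    """Truncate the row's datetime to the top of its hour."""
    return row[0].replace(minute=0, second=0, microsecond=0)


candles = []
# groupby only merges adjacent rows, so sort by datetime before grouping.
for start, group in groupby(sorted(arr, key=itemgetter(0)), key=hour_start):
    values = [value for _, value in group]
    candles.append((start, max(values), min(values)))
```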

Answer 3: (score: 0)

You can use pandas.

    import datetime
    import pandas as pd

  1. Create a DataFrame and sort it by time, where d is your input list of tuples:

    df = pd.DataFrame(d, columns=['time', 'price']).sort_values('time')

                     time  price
    0 2015-04-26 00:10:00   25.2
    1 2015-04-26 00:20:00   25.1
    2 2015-04-26 00:30:00   25.7
    3 2015-04-26 00:40:00   23.2
    4 2015-04-26 00:50:00   22.2
    5 2015-04-26 00:59:00   29.2
    6 2015-04-26 01:00:00   22.2
    7 2015-04-26 01:10:00   21.2

  2. Create a column holding the date and hour:

    df['day_hour'] = df.apply(lambda r: datetime.datetime(r['time'].year, r['time'].month, r['time'].day, r['time'].hour, 0), axis=1)

                     time  price            day_hour
    0 2015-04-26 00:10:00   25.2 2015-04-26 00:00:00
    1 2015-04-26 00:20:00   25.1 2015-04-26 00:00:00
    2 2015-04-26 00:30:00   25.7 2015-04-26 00:00:00
    3 2015-04-26 00:40:00   23.2 2015-04-26 00:00:00
    4 2015-04-26 00:50:00   22.2 2015-04-26 00:00:00
    5 2015-04-26 00:59:00   29.2 2015-04-26 00:00:00
    6 2015-04-26 01:00:00   22.2 2015-04-26 01:00:00
    7 2015-04-26 01:10:00   21.2 2015-04-26 01:00:00

  3. Drop the original "time" column, since it is not used in the output:

    df = df.drop('time', axis=1)

  4. Group the data by date and hour:

    dfgrouped = df.groupby('day_hour')

  5. Get the max and min for each day_hour:

    dfmax = dfgrouped.max()
    dfmin = dfgrouped.min()

  6. Join the max and min on the same day_hour:

    dfout = dfmax.join(dfmin, lsuffix='_max', rsuffix='_min')

    >>> dfout
                         price_max  price_min
    day_hour
    2015-04-26 00:00:00       29.2       22.2
    2015-04-26 01:00:00       22.2       21.2
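
The same result can probably be reached in fewer steps. The following is a minimal sketch, not the method used above: it floors each timestamp to the hour with `Series.dt.floor` and aggregates with `agg`, assuming a reasonably recent pandas and the same `time`/`price` column names. The sample list `d` is a subset of the rows shown above.

```python
import datetime
import pandas as pd

# A few of the rows from the table above, as (datetime, price) tuples.
d = [
    (datetime.datetime(2015, 4, 26, 0, 10), 25.2),
    (datetime.datetime(2015, 4, 26, 0, 50), 22.2),
    (datetime.datetime(2015, 4, 26, 0, 59), 29.2),
    (datetime.datetime(2015, 4, 26, 1, 0), 22.2),
    (datetime.datetime(2015, 4, 26, 1, 10), 21.2),
]

df = pd.DataFrame(d, columns=['time', 'price'])

# Floor each timestamp to the top of its hour; this replaces the apply() step.
day_hour = df['time'].dt.floor('h').rename('day_hour')

# Group on the floored timestamps and compute both extremes in one call.
dfout = df.groupby(day_hour)['price'].agg(['max', 'min'])
print(dfout)
```

Here the resulting columns are named `max` and `min` rather than `price_max` and `price_min`.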