Python - Merging data by timestamp

Time: 2017-08-23 09:20:56

Tags: python

My source data looks like this:

[{'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
{'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
{'value': 11.0002 , 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'},]

My target data looks like this:

{'timestamp': 1488355514, 'max': 10.0001, 'vin': 'a1', 'min': 'null'}
{'timestamp': 1488356113, 'max': 11.0002, 'vin': 'a1', 'min': 5.0002}

Here is my current code:

import json

source = [
    {'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
    {'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
    {'value': 11.0002 , 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'},
]

target = []
for obj in source:
    target.append({
        'vin':obj['vin'],
        'timestamp': int(obj['epoch_ms']/1000),
        'min': obj['value'] if obj['name'] == 'VMin' else '',
        'max': obj['value'] if obj['name'] == 'VMax' else '',
    })

for obj in target:
    print(obj)

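The current output of my code is shown below, but it does not merge entries that share a timestamp (1488356113 in the example below). How can I merge entries with the same timestamp into one, so that the result matches my target data format?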

{'timestamp': 1488355514, 'max': 10.0001, 'vin': 'a1', 'min': ''}
{'timestamp': 1488356113, 'max': '', 'vin': 'a1', 'min': 5.0002}
{'timestamp': 1488356113, 'max': 11.0002, 'vin': 'a1', 'min': ''}

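Please advise, thanks!

3 Answers:

Answer 0: (score: 0)

Your current code puts every input row into target. The following merges rows that share a timestamp: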

import json

source = sorted([
    {'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
    {'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
    {'value': 11.0002 , 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'}
], key=lambda x: x['epoch_ms'])

target = []
prev_tstamp = None
for obj in source:
    # Truncate milliseconds to whole seconds.
    tstamp = obj['epoch_ms'] // 1000
    if tstamp == prev_tstamp:
        # Same second as the previous row: fill in the field that is still empty.
        if obj['name'] == 'VMin':
            target[-1]['min'] = obj['value']
        else:
            target[-1]['max'] = obj['value']
    else:
        # New timestamp: start a new merged row.
        target.append({
            'vin': obj['vin'],
            'timestamp': tstamp,
            'min': obj['value'] if obj['name'] == 'VMin' else '',
            'max': obj['value'] if obj['name'] == 'VMax' else '',
        })
    prev_tstamp = tstamp

for obj in target:
    print(obj)
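
The same sort-then-merge idea can also be sketched with itertools.groupby from the standard library. This is only a sketch under a few assumptions, not the answerer's code: the names group_key and merge_by_second are made up for illustration, rows are grouped on (vin, second) so that records from different vehicles would not be merged (the sample data only has one vin), and None is used instead of '' for a missing value.

```python
from itertools import groupby

source = [
    {'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
    {'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
    {'value': 11.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'},
]


def group_key(obj):
    # Hypothetical grouping key: (vin, timestamp truncated to whole seconds).
    return obj['vin'], obj['epoch_ms'] // 1000


def merge_by_second(records):
    """Hypothetical helper: merge VMin/VMax records sharing a (vin, second)."""
    merged = []
    # groupby only groups adjacent items, so sort by the same key first.
    ordered = sorted(records, key=group_key)
    for (vin, tstamp), group in groupby(ordered, key=group_key):
        # Assumption: None marks a value that was not reported.
        row = {'vin': vin, 'timestamp': tstamp, 'min': None, 'max': None}
        for obj in group:
            if obj['name'] == 'VMin':
                row['min'] = obj['value']
            elif obj['name'] == 'VMax':
                row['max'] = obj['value']
        merged.append(row)
    return merged


target = merge_by_second(source)
for obj in target:
    print(obj)
```

Grouping on the (vin, second) pair is a design choice: comparing timestamps alone would merge rows from two different vehicles that happen to report in the same second.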

Answer 1: (score: 0)

Since the timestamp is your key, you can build a dictionary first and then convert it to a list:

src = [
    {'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
    {'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
    {'value': 11.0002 , 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'}]


def group_min_max(dictionaries):
    """Groups minimum and maximum value dictionaries by time stamps"""

    result = {}

    for dictionary in dictionaries:
        # Truncate milliseconds to whole seconds so the key matches the target format.
        timestamp = dictionary['epoch_ms'] // 1000

        try:
            result_dictionary = result[timestamp]
        except KeyError:
            result[timestamp] = {
                'timestamp': timestamp,
                'max': dictionary['value'] if dictionary['name'] == 'VMax' else 'null',
                'vin': dictionary['vin'],
                'min': dictionary['value'] if dictionary['name'] == 'VMin' else 'null'}
        else:
            if dictionary['name'] == 'VMin':
                result_dictionary['min'] = dictionary['value']
            elif dictionary['name'] == 'VMax':
                result_dictionary['max'] = dictionary['value']

    return result


target = list(group_min_max(src).values())

print(target)
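
If the merged rows are meant to end up as JSON, one option is to store None for a missing value instead of the string 'null', because json.dumps writes None as a real JSON null. Below is a minimal sketch of that variant. The function name group_min_max_by_second is made up for illustration, and keying on (vin, second) is an assumption so that rows from different vehicles stay separate.

```python
import json

src = [
    {'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
    {'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
    {'value': 11.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'},
]


def group_min_max_by_second(dictionaries):
    """Hypothetical variant: group by (vin, whole-second timestamp)."""
    result = {}
    for dictionary in dictionaries:
        vin = dictionary['vin']
        timestamp = dictionary['epoch_ms'] // 1000
        # setdefault creates the merged row the first time a key is seen
        # and returns the existing row afterwards.
        row = result.setdefault(
            (vin, timestamp),
            {'timestamp': timestamp, 'max': None, 'vin': vin, 'min': None},
        )
        if dictionary['name'] == 'VMin':
            row['min'] = dictionary['value']
        elif dictionary['name'] == 'VMax':
            row['max'] = dictionary['value']
    return list(result.values())


target = group_min_max_by_second(src)
# None is serialized as JSON null; sort_keys makes the key order stable.
lines = [json.dumps(row, sort_keys=True) for row in target]
for line in lines:
    print(line)
```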

Answer 2: (score: 0)

I guess there are many ways to do this; here is one using pandas:

import pandas as pd
from collections import defaultdict

lst =[{'value': 10.0001, 'epoch_ms': 1488355514015, 'vin': 'a1', 'name': 'VMax'},
{'value': 5.0002, 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMin'},
{'value': 11.0002 , 'epoch_ms': 1488356113504, 'vin': 'a1', 'name': 'VMax'},]

df = pd.DataFrame(lst)
gb = df.groupby("epoch_ms")

d = defaultdict(list)
for item in gb:
    for v in item[1].to_dict("index").values():
        key = int(v.pop("epoch_ms")/1000)
        d[key].append(v)

d

This returns a dictionary whose keys are the timestamps and whose values are lists of dictionaries.

defaultdict(list,
            {1488355514: [{'name': 'VMax', 'value': 10.0001, 'vin': 'a1'}],
             1488356113: [{'name': 'VMin', 'value': 5.0002, 'vin': 'a1'},
              {'name': 'VMax', 'value': 11.0002, 'vin': 'a1'}]})
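
The grouped dictionary is not yet in the target format from the question. One possible way to collapse it into one row per timestamp, in plain Python, is sketched below. The names FIELD_FOR_NAME and flatten are made up for illustration, a plain dict stands in for the defaultdict, and the sketch assumes that every entry under one timestamp has the same vin.

```python
# Grouped data in the shape shown above (a plain dict stands in for the defaultdict).
d = {
    1488355514: [{'name': 'VMax', 'value': 10.0001, 'vin': 'a1'}],
    1488356113: [{'name': 'VMin', 'value': 5.0002, 'vin': 'a1'},
                 {'name': 'VMax', 'value': 11.0002, 'vin': 'a1'}],
}

# Hypothetical mapping from the source 'name' field to the target key.
FIELD_FOR_NAME = {'VMin': 'min', 'VMax': 'max'}


def flatten(grouped):
    """Hypothetical helper: turn {timestamp: [records]} into one row per timestamp."""
    rows = []
    for tstamp in sorted(grouped):
        items = grouped[tstamp]
        # Assumption: all items under one timestamp share the same vin.
        row = {'timestamp': tstamp, 'max': None, 'vin': items[0]['vin'], 'min': None}
        for item in items:
            field = FIELD_FOR_NAME.get(item['name'])
            if field is not None:
                row[field] = item['value']
        rows.append(row)
    return rows


target = flatten(d)
for row in target:
    print(row)
```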