Python: sort by date and time and select the older values

Time: 2019-06-24 11:50:40

Tags: python json amazon-web-services

I am trying to find old, duplicate running instances in my AWS account.

So far I have the following JSON data, which lists the duplicated instances.

[
    {
        "InstanceName": "example-instance-0",
        "InstanceId": "i-0966108",
        "InstanceLaunchTime": "2019-06-20 19:10:50+00:00"
    },
    {
        "InstanceName": "example-instance-1",
        "InstanceId": "i-0d83ecc",
        "InstanceLaunchTime": "2019-06-20 22:27:10+00:00"
    },
    {
        "InstanceName": "example-instance-0",
        "InstanceId": "i-0268215",
        "InstanceLaunchTime": "2019-04-19 14:25:11+00:00"
    },
    {
        "InstanceName": "example-instance-1",
        "InstanceId": "i-0a9b614",
        "InstanceLaunchTime": "2019-06-19 21:57:50+00:00"
    }
]

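Here I want to exclude the most recent instance of each name, by launch date and time, and print all the others.

I can do this with a pandas DataFrame, but I can't work out how to do it without pandas. Is there a way to do this?

The output I am looking for: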

example-instance-0,i-0268215,2019-04-19,14:25:11
example-instance-1,i-0a9b614,2019-06-19,21:57:50

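Sorry, I am still a Python beginner and am looking for help. Thanks.

3 answers:

Answer 0: (score: 1)

Here is code that should do the trick. Note that I have not formatted the output as strings; you get a list that you can format however you need.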

inputs = [
{
    "InstanceName": "example-instance-0",
    "InstanceId": "i-0966108",
    "InstanceLaunchTime": "2019-06-20 19:10:50+00:00"
},
{
    "InstanceName": "example-instance-1",
    "InstanceId": "i-0d83ecc",
    "InstanceLaunchTime": "2019-06-20 22:27:10+00:00"
},
{
    "InstanceName": "example-instance-0",
    "InstanceId": "i-0268215",
    "InstanceLaunchTime": "2019-04-19 14:25:11+00:00"
},
{
    "InstanceName": "example-instance-1",
    "InstanceId": "i-0a9b614",
    "InstanceLaunchTime": "2019-06-19 21:57:50+00:00"
}
]

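outputs = []
keys = []
# Walk the instances oldest first, so the first entry seen for each name is the
# oldest one. All timestamps share one format and UTC offset, so sorting the
# strings also sorts them chronologically.
for ip in sorted(inputs, key=lambda d: d["InstanceLaunchTime"]):
    if ip["InstanceName"] not in keys:
        outputs.append([ip["InstanceName"], ip["InstanceId"], ip["InstanceLaunchTime"]])
        keys.append(ip["InstanceName"])
print(outputs)

You will get the output

[['example-instance-0', 'i-0268215', '2019-04-19 14:25:11+00:00'], ['example-instance-1', 'i-0a9b614', '2019-06-19 21:57:50+00:00']]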
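
The list above still has to be formatted to match the output requested in the question. Here is a minimal sketch, assuming each row is a [name, id, launch time] list like the ones in outputs and that every timestamp has the "YYYY-MM-DD HH:MM:SS+00:00" shape shown in the data. The helper name format_row is hypothetical, and datetime.fromisoformat needs Python 3.7 or later.

```python
from datetime import datetime


def format_row(row):
    """Turn [name, instance_id, launch_time] into 'name,id,date,time'."""
    name, instance_id, launch_time = row
    # Parse the timestamp so date and time can be printed separately
    launched = datetime.fromisoformat(launch_time)
    return ",".join([
        name,
        instance_id,
        launched.strftime("%Y-%m-%d"),
        launched.strftime("%H:%M:%S"),
    ])


rows = [
    ["example-instance-0", "i-0268215", "2019-04-19 14:25:11+00:00"],
    ["example-instance-1", "i-0a9b614", "2019-06-19 21:57:50+00:00"],
]
for row in rows:
    print(format_row(row))
```

For the two rows shown, this prints one comma-separated line per instance, with the UTC offset dropped.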

Answer 1: (score: 1)

Use itertools.groupby

For example:

from itertools import groupby
from pprint import pprint

data = [{'InstanceId': 'i-0966108', 'InstanceName': 'example-instance-0', 'InstanceLaunchTime': '2019-06-20 19:10:50+00:00'}, {'InstanceId': 'i-0d83ecc', 'InstanceName': 'example-instance-1', 'InstanceLaunchTime': '2019-06-20 22:27:10+00:00'}, {'InstanceId': 'i-0268215', 'InstanceName': 'example-instance-0', 'InstanceLaunchTime': '2019-04-19 14:25:11+00:00'}, {'InstanceId': 'i-0a9b614', 'InstanceName': 'example-instance-1', 'InstanceLaunchTime': '2019-06-19 21:57:50+00:00'}]
result = []
# Sort by name, then launch time, so each group runs from oldest to newest
for _, v in groupby(sorted(data, key=lambda x: (x["InstanceName"], x["InstanceLaunchTime"])), lambda x: x["InstanceName"]):
    result.extend(list(v)[:-1])  # Exclude the latest item in each group
pprint(result)

Output:

[{'InstanceId': 'i-0268215',
  'InstanceLaunchTime': '2019-04-19 14:25:11+00:00',
  'InstanceName': 'example-instance-0'},
 {'InstanceId': 'i-0a9b614',
  'InstanceLaunchTime': '2019-06-19 21:57:50+00:00',
  'InstanceName': 'example-instance-1'}]
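
The same "everything except the newest per name" idea can also be written with a plain dictionary instead of groupby. The sketch below is one possible variant, not the answer's own method: it compares parsed datetime values rather than strings, and it keeps every older duplicate when a name appears more than twice. The function name older_instances is hypothetical, and datetime.fromisoformat assumes Python 3.7 or later.

```python
from collections import defaultdict
from datetime import datetime


def older_instances(instances):
    """Return every instance except the most recently launched one per name."""
    by_name = defaultdict(list)
    for inst in instances:
        by_name[inst["InstanceName"]].append(inst)

    older = []
    for name in sorted(by_name):
        # Order each group from oldest to newest by the parsed launch time
        group = sorted(
            by_name[name],
            key=lambda d: datetime.fromisoformat(d["InstanceLaunchTime"]),
        )
        older.extend(group[:-1])  # drop the newest, keep the rest
    return older


data = [
    {"InstanceName": "example-instance-0", "InstanceId": "i-0966108",
     "InstanceLaunchTime": "2019-06-20 19:10:50+00:00"},
    {"InstanceName": "example-instance-1", "InstanceId": "i-0d83ecc",
     "InstanceLaunchTime": "2019-06-20 22:27:10+00:00"},
    {"InstanceName": "example-instance-0", "InstanceId": "i-0268215",
     "InstanceLaunchTime": "2019-04-19 14:25:11+00:00"},
    {"InstanceName": "example-instance-1", "InstanceId": "i-0a9b614",
     "InstanceLaunchTime": "2019-06-19 21:57:50+00:00"},
]
for inst in older_instances(data):
    print(inst["InstanceId"])
```

A name that occurs only once produces an empty slice, so instances without duplicates are never reported.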

Answer 2: (score: 0)

First sort by the timestamps, then remove the duplicates. Try the code below.

lst1=[
    {
        "InstanceName": "example-instance-0",
        "InstanceId": "i-0966108",
        "InstanceLaunchTime": "2019-06-20 19:10:50+00:00"
    },
    {
        "InstanceName": "example-instance-1",
        "InstanceId": "i-0d83ecc",
        "InstanceLaunchTime": "2019-06-20 22:27:10+00:00"
    },
    {
        "InstanceName": "example-instance-0",
        "InstanceId": "i-0268215",
        "InstanceLaunchTime": "2019-04-19 14:25:11+00:00"
    },
    {
        "InstanceName": "example-instance-1",
        "InstanceId": "i-0a9b614",
        "InstanceLaunchTime": "2019-06-19 21:57:50+00:00"
    }
]
# Sort on the full timestamp (date and time) using a lambda function.
# All values share one format and UTC offset, so string order is time order.
lst1 = sorted(lst1, key=lambda i: i['InstanceLaunchTime'])

seen = set()
res_list = []
for d in lst1:
    # Keep only the first (oldest) entry seen for each instance name
    if d['InstanceName'] not in seen:
        seen.add(d['InstanceName'])
        res_list.append(d)
print(res_list)
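
Sorting the raw timestamp strings works for this data because every value uses the same format and the +00:00 offset. If the offsets could differ, string order and chronological order can disagree. The short sketch below illustrates this with two made-up timestamps (they are not from the question's data) and datetime.fromisoformat, which assumes Python 3.7 or later.

```python
from datetime import datetime

# Hypothetical values for illustration only
local_ts = "2019-06-20 01:00:00+02:00"  # equals 2019-06-19 23:00:00 UTC
utc_ts = "2019-06-19 23:30:00+00:00"    # equals 2019-06-19 23:30:00 UTC

# As plain strings, local_ts sorts after utc_ts
string_says_earlier = local_ts < utc_ts

# As timezone-aware datetimes, local_ts is the earlier moment
datetime_says_earlier = datetime.fromisoformat(local_ts) < datetime.fromisoformat(utc_ts)

print(string_says_earlier, datetime_says_earlier)
```

Passing key=lambda i: datetime.fromisoformat(i['InstanceLaunchTime']) to sorted is one way to make the ordering independent of the offset.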