I am trying to find old, duplicate running instances in my AWS account.
So far I can produce the following JSON data, which lists the duplicated instances.
[
{
"InstanceName": "example-instance-0",
"InstanceId": "i-0966108",
"InstanceLaunchTime": "2019-06-20 19:10:50+00:00"
},
{
"InstanceName": "example-instance-1",
"InstanceId": "i-0d83ecc",
"InstanceLaunchTime": "2019-06-20 22:27:10+00:00"
},
{
"InstanceName": "example-instance-0",
"InstanceId": "i-0268215",
"InstanceLaunchTime": "2019-04-19 14:25:11+00:00"
},
{
"InstanceName": "example-instance-1",
"InstanceId": "i-0a9b614",
"InstanceLaunchTime": "2019-06-19 21:57:50+00:00"
}
]
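Here I want to exclude the latest instance of each name, by launch date and time, and print all the others.
I can do this with a pandas DataFrame, but I cannot figure out how to do it without pandas. Is there a way to do this?
The output I am looking for: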
example-instance-0,i-0268215,2019-04-19,14:25:11
example-instance-1,i-0a9b614,2019-06-19,21:57:50
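Sorry, I am still a beginner in Python and am looking for help. Thanks.
Answer 0 (score: 1)
Here is code that should do the trick. Note that I have not formatted the output as strings; you get a list that you can format however you need.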
inputs = [
{
"InstanceName": "example-instance-0",
"InstanceId": "i-0966108",
"InstanceLaunchTime": "2019-06-20 19:10:50+00:00"
},
{
"InstanceName": "example-instance-1",
"InstanceId": "i-0d83ecc",
"InstanceLaunchTime": "2019-06-20 22:27:10+00:00"
},
{
"InstanceName": "example-instance-0",
"InstanceId": "i-0268215",
"InstanceLaunchTime": "2019-04-19 14:25:11+00:00"
},
{
"InstanceName": "example-instance-1",
"InstanceId": "i-0a9b614",
"InstanceLaunchTime": "2019-06-19 21:57:50+00:00"
}
]
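outputs = []
keys = []
# Walk the list from the end and keep the first entry seen for each name.
# Note: this relies on the older duplicate appearing later in the input,
# as it does in this sample; it does not compare launch times.
for ip in reversed(inputs):
    if ip["InstanceName"] not in keys:
        outputs.append([ip["InstanceName"], ip["InstanceId"], ip["InstanceLaunchTime"]])
        keys.append(ip["InstanceName"])
print(outputs)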
You will get the output:
>>> [['example-instance-1', 'i-0a9b614', '2019-06-19 21:57:50+00:00'], ['example-instance-0', 'i-0268215', '2019-04-19 14:25:11+00:00']]
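The list above still needs to be turned into the comma-separated lines asked for in the question. A minimal sketch of that step, assuming Python 3.7+ (for `datetime.fromisoformat`) and that every timestamp has the `YYYY-MM-DD HH:MM:SS+00:00` shape shown in the sample; the helper name `format_row` is made up for illustration:

```python
from datetime import datetime

# The rows produced by the answer above, copied here so the sketch runs on its own.
rows = [
    ["example-instance-1", "i-0a9b614", "2019-06-19 21:57:50+00:00"],
    ["example-instance-0", "i-0268215", "2019-04-19 14:25:11+00:00"],
]

def format_row(name, instance_id, launch_time):
    """Return 'name,id,date,time' for one row (hypothetical helper)."""
    # fromisoformat accepts a space between the date and the time.
    ts = datetime.fromisoformat(launch_time)
    return f"{name},{instance_id},{ts:%Y-%m-%d},{ts:%H:%M:%S}"

# Sort the formatted lines so the instance names appear in order.
lines = sorted(format_row(*row) for row in rows)
print("\n".join(lines))
```

Parsing the timestamp rather than slicing the string keeps the formatting independent of the exact offset suffix.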
Answer 1 (score: 1)
Use itertools.groupby.
For example:
from itertools import groupby
from pprint import pprint
data = [{'InstanceId': 'i-0966108', 'InstanceName': 'example-instance-0', 'InstanceLaunchTime': '2019-06-20 19:10:50+00:00'}, {'InstanceId': 'i-0d83ecc', 'InstanceName': 'example-instance-1', 'InstanceLaunchTime': '2019-06-20 22:27:10+00:00'}, {'InstanceId': 'i-0268215', 'InstanceName': 'example-instance-0', 'InstanceLaunchTime': '2019-04-19 14:25:11+00:00'}, {'InstanceId': 'i-0a9b614', 'InstanceName': 'example-instance-1', 'InstanceLaunchTime': '2019-06-19 21:57:50+00:00'}]
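result = []
# Sort by name, then launch time, so each group is ordered oldest to newest.
by_name_then_time = sorted(data, key=lambda x: (x["InstanceName"], x["InstanceLaunchTime"]))
for _, v in groupby(by_name_then_time, lambda x: x["InstanceName"]):
    result.extend(list(v)[:-1])  # Exclude the latest item in each group
pprint(result)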
Output:
[{'InstanceId': 'i-0268215',
'InstanceLaunchTime': '2019-04-19 14:25:11+00:00',
'InstanceName': 'example-instance-0'},
{'InstanceId': 'i-0a9b614',
'InstanceLaunchTime': '2019-06-19 21:57:50+00:00',
'InstanceName': 'example-instance-1'}]
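The same "drop the newest entry per name" idea can also be written without groupby, which avoids having to pre-sort the whole list by name. This is only a sketch under a few assumptions: Python 3.7+, timestamps that `datetime.fromisoformat` can parse, and a made-up function name `older_duplicates`:

```python
from collections import defaultdict
from datetime import datetime

def older_duplicates(instances):
    """Return every instance except the most recently launched one per name."""
    groups = defaultdict(list)
    for inst in instances:
        groups[inst["InstanceName"]].append(inst)

    older = []
    for name in sorted(groups):
        # Order each group oldest to newest by the parsed launch time.
        ordered = sorted(
            groups[name],
            key=lambda i: datetime.fromisoformat(i["InstanceLaunchTime"]),
        )
        older.extend(ordered[:-1])  # everything except the newest
    return older

# Sample data from the question.
sample = [
    {"InstanceName": "example-instance-0", "InstanceId": "i-0966108",
     "InstanceLaunchTime": "2019-06-20 19:10:50+00:00"},
    {"InstanceName": "example-instance-1", "InstanceId": "i-0d83ecc",
     "InstanceLaunchTime": "2019-06-20 22:27:10+00:00"},
    {"InstanceName": "example-instance-0", "InstanceId": "i-0268215",
     "InstanceLaunchTime": "2019-04-19 14:25:11+00:00"},
    {"InstanceName": "example-instance-1", "InstanceId": "i-0a9b614",
     "InstanceLaunchTime": "2019-06-19 21:57:50+00:00"},
]

older = older_duplicates(sample)
for inst in older:
    print(inst["InstanceName"], inst["InstanceId"], inst["InstanceLaunchTime"])
```

If a name has three or more instances, all but the newest are returned; a name with a single instance contributes nothing.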
Answer 2 (score: 0)
First sort by the timestamps, then remove the duplicates. Try the following code.
lst1=[
{
"InstanceName": "example-instance-0",
"InstanceId": "i-0966108",
"InstanceLaunchTime": "2019-06-20 19:10:50+00:00"
},
{
"InstanceName": "example-instance-1",
"InstanceId": "i-0d83ecc",
"InstanceLaunchTime": "2019-06-20 22:27:10+00:00"
},
{
"InstanceName": "example-instance-0",
"InstanceId": "i-0268215",
"InstanceLaunchTime": "2019-04-19 14:25:11+00:00"
},
{
"InstanceName": "example-instance-1",
"InstanceId": "i-0a9b614",
"InstanceLaunchTime": "2019-06-19 21:57:50+00:00"
}
]
# Sort by the full timestamp (date and time) using a lambda function.
# Sorting on the time-of-day part alone would ignore the date.
lst1 = sorted(lst1, key=lambda i: i['InstanceLaunchTime'])
seen = set()
res_list = []
# Keep the first (oldest) entry seen for each instance name.
for d in lst1:
    if d['InstanceName'] not in seen:
        seen.add(d['InstanceName'])
        res_list.append(d)
print(res_list)
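One caveat when sorting on the raw timestamp string: it only orders correctly while every value carries the same UTC offset, as the sample data does. The sketch below uses two made-up timestamps (not from the question) to show the difference between string order and real chronological order, assuming Python 3.7+ for `datetime.fromisoformat`:

```python
from datetime import datetime

# Hypothetical values chosen to differ in UTC offset.
a = "2019-06-20 01:00:00+02:00"  # equals 2019-06-19 23:00:00 UTC
b = "2019-06-19 23:30:00+00:00"  # equals 2019-06-19 23:30:00 UTC

# Plain string comparison looks only at the characters, so b sorts first.
by_string = sorted([a, b])

# Parsing to timezone-aware datetimes compares the actual instants, so a sorts first.
by_instant = sorted([a, b], key=datetime.fromisoformat)

print(by_string)
print(by_instant)
```

If the launch times might come from sources with mixed offsets, passing `datetime.fromisoformat` as the sort key is the safer choice.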