I have a list of lists in Python, as follows:
[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
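Column 1: epoch_time
Column 2: serial_number
Column 3: domain
Column 4: server
How can I go through the list of lists, domain by domain, and remove every row whose serial_number equals the serial_number of that domain's 8.8.8.8 row (keeping the 8.8.8.8 row itself), so that the final output looks like this: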
[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
Answer 0 (score: 2)
This should do it:
a = [['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
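# Serial numbers reported by the reference server 8.8.8.8.
remove = [item[1] for item in a if item[3] == '8.8.8.8']
# Keep the 8.8.8.8 rows, plus any row whose serial number differs from theirs.
clean = [item for item in a if item[1] not in remove or item[3] == '8.8.8.8']
print(clean)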
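The comprehension above matches on serial_number alone, while the question asks for a per-domain comparison. A minimal sketch that keys on the (domain, serial_number) pair instead is shown here. The example.org row is hypothetical and not part of the question's data; it is included only to show a case where the two approaches differ. The names rows, REFERENCE and reference_keys are likewise illustrative.

```python
rows = [
    ['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
    ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
    ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
    # Hypothetical row (not in the question): same serial, different domain.
    ['1490026791.60', '2010113820', 'example.org', '208.67.220.220'],
]
REFERENCE = '8.8.8.8'

# (domain, serial_number) pairs reported by the reference server.
reference_keys = {(row[2], row[1]) for row in rows if row[3] == REFERENCE}

# Keep reference rows, and rows whose (domain, serial) pair the reference
# server did not report.
clean = [row for row in rows
         if row[3] == REFERENCE or (row[2], row[1]) not in reference_keys]

for row in clean:
    print(row)
```

With a serial-only check the example.org row would be dropped, because its serial happens to match amazon.com's; with the pair as the key it is kept.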
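Answer 1 (score: 1)
You didn't write any code, so I won't either.
Build a collection, serial_numbers. Iterate over the list and, if the ip is 8.8.8.8, add the row's serial number to serial_numbers.
Iterate over the list a second time, with a list comprehension. Keep the element if its ip is 8.8.8.8 or its serial_number is not in serial_numbers. It will be short to write and quick to run.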
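The two passes described above could be sketched roughly as follows. This is one possible reading of the answer, not the author's own code; it assumes serial_numbers is a set (the answer does not say which collection to use) and runs on a subset of the question's rows.

```python
rows = [
    ['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
    ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
    ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
    ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
    ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
    ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
    ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
    ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
]

# First pass: collect the serial numbers reported by 8.8.8.8.
serial_numbers = set()
for row in rows:
    if row[3] == '8.8.8.8':
        serial_numbers.add(row[1])

# Second pass: keep 8.8.8.8 rows and rows with a different serial number.
kept = [row for row in rows
        if row[3] == '8.8.8.8' or row[1] not in serial_numbers]

for row in kept:
    print(row)
```

A set keeps each membership test at roughly constant cost, which matters more as the number of distinct serial numbers grows.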
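Answer 2 (score: 1)
I would sort the list so that the rows whose address is 8.8.8.8 come first, then iterate over the list, recording the key (serial, domain) on insertion to make sure each key is inserted only once.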
l = [['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
inserted = set()
result = []
# sorted() is stable and False sorts before True, so 8.8.8.8 rows come first.
for row in sorted(l, key=lambda r: r[3] != "8.8.8.8"):
    timestamp, serial, domain, server = row
    k = (serial, domain)
    if k in inserted:
        pass  # already in result: skip
    else:
        result.append(row)
        inserted.add(k)
Result:
[['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'], ['1490019411.19', '150612899', 'google.com', '8.8.8.8'], ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8'], ['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'], ['1490026791.35', '150612898', 'google.com', '208.67.222.222'], ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4']]
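The result above lists the 8.8.8.8 rows first rather than in the order the question shows. If the original order matters, one option (a sketch, not part of the answer) is to carry each row's original index through the sort and restore it at the end. The names rows, indexed, seen and kept are illustrative, and the data is a subset of the question's rows.

```python
rows = [
    ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
    ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
    ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
    ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
    ['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
    ['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
    ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
    ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8'],
]

# Pair every row with its original position, then put 8.8.8.8 rows first.
indexed = sorted(enumerate(rows), key=lambda pair: pair[1][3] != '8.8.8.8')

seen = set()
kept = []
for index, row in indexed:
    key = (row[1], row[2])  # (serial, domain)
    if key not in seen:
        seen.add(key)
        kept.append((index, row))

# Indices are unique, so sorting the pairs restores the original order.
result = [row for index, row in sorted(kept)]

for row in result:
    print(row)
```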
Answer 3 (score: 1)
You can collect all the serial_numbers associated with the server (8.8.8.8), then skip them with an if condition when building the new list!
data=[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
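serv = '8.8.8.8'
# list() keeps the serials reusable on Python 3, where filter() returns a one-shot iterator.
fil = list(filter(None, map(lambda x: x[1] if x[3] == serv else None, data)))
print([i for i in data if i[1] not in fil or i[3] == serv])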
Output:
[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'], ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'], ['1490026791.35', '150612898', 'google.com', '208.67.222.222'], ['1490019411.19', '150612899', 'google.com', '8.8.8.8'], ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'], ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
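If you time it: with a list comprehension (as in a few of the other solutions),
7.9870223999e-05
with lambda and map,
4.81605529785e-05
The difference should not matter in this case, but timing does matter when the data set is large. Hope it helps!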
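The answer does not show how the timings were taken. A minimal sketch of one way to compare the two variants uses the standard timeit module. The function names with_comprehension and with_map_filter, the subset of rows, and the repeat count are assumptions made for illustration; the numbers will vary by machine and Python version.

```python
import timeit

rows = [
    ['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
    ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
    ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
    ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
]
SERVER = '8.8.8.8'


def with_comprehension():
    # Collect the reference serials with a list comprehension.
    serials = [row[1] for row in rows if row[3] == SERVER]
    return [row for row in rows if row[1] not in serials or row[3] == SERVER]


def with_map_filter():
    # Collect the reference serials with map() and filter().
    serials = list(filter(None, map(
        lambda row: row[1] if row[3] == SERVER else None, rows)))
    return [row for row in rows if row[1] not in serials or row[3] == SERVER]


# Run each variant many times; a single run on a few rows is mostly noise.
for func in (with_comprehension, with_map_filter):
    seconds = timeit.timeit(func, number=10000)
    print(func.__name__, seconds)
```

Repeating each call many times gives a steadier figure than timing a single pass over a dozen rows.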
Answer 4 (score: 1)
Just build the list of serial numbers to filter on, then apply the filter in a list comprehension:
>>> l = [['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
>>>
>>> ip_check = '8.8.8.8'
>>> filter_serials = [lst[1] for lst in l if lst[3] == ip_check]
>>> filter_serials
['2010113820', '150612899', '2017032001']
>>>
>>> output_list = [lst for lst in l if lst[3] == ip_check or lst[1] not in filter_serials]
>>>
>>> for lst in output_list:
...     print(lst)
...
['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220']
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8']
['1490026791.35', '150612898', 'google.com', '208.67.222.222']
['1490019411.19', '150612899', 'google.com', '8.8.8.8']
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4']
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']