Iterating over a Python list of lists and removing lists based on a condition

Time: 2017-03-20 16:57:49

Tags: python list loops

I have a list of lists in Python, as follows:

[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
 ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
 ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
 ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
 ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
 ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
 ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
 ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
 ['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
 ['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
 ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
 ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]

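Column 1: epoch_time
Column 2: serial_number
Column 3: domain
Column 4: server

How can I iterate over the list of lists, domain by domain, and remove every list whose serial_number equals the serial_number reported by 8.8.8.8 (keeping the 8.8.8.8 list itself), so that the final output looks like this: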

[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
 ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
 ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
 ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
 ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
 ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]

5 answers:

Answer 0 (score: 2)

This should do it:

a = [['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
 ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
 ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
 ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
 ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
 ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
 ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
 ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
 ['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
 ['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
 ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
 ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]

remove = [item[1] for item in a if item[3]=='8.8.8.8']
clean = [item for item in a if item[1] not in remove or item[3]=='8.8.8.8']
print(clean)

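Answer 1 (score: 1)

You didn't write any code, so I won't either.

  • Create a set of forbidden serial_numbers.
  • Iterate over the list once.
  • If the ip is 8.8.8.8, add the serial number to serial_numbers.
  • Iterate over the list a second time, in a list comprehension.
  • Keep the element if the ip is 8.8.8.8 or its serial_number is not in serial_numbers.

It will be short to write and fast to run.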
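
The steps listed above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the names `rows`, `TARGET_IP`, `serial_numbers` and `kept` are assumptions made for the example, and only the amazon.com rows from the question are used.

```python
# Hypothetical sample: the amazon.com rows from the question.
rows = [
    ['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
    ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
    ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
    ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
]

TARGET_IP = '8.8.8.8'  # assumed name for the reference server

# First pass: collect the serial numbers reported by the reference server.
serial_numbers = {row[1] for row in rows if row[3] == TARGET_IP}

# Second pass: keep the reference rows and every row whose serial differs.
kept = [row for row in rows
        if row[3] == TARGET_IP or row[1] not in serial_numbers]

for row in kept:
    print(row)
```

Using a set for `serial_numbers` keeps each membership test cheap, which is presumably what "fast to run" refers to.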

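Answer 2 (score: 1)

I would sort the list so that the rows with address 8.8.8.8 come first, then iterate over the list, recording the key (serial, domain) on each insertion to make sure it is only inserted once.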

l = [['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
 ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
 ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
 ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
 ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
 ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
 ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
 ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
 ['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
 ['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
 ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
 ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]

inserted = set()
result = []
for row in sorted(l,key=lambda r: r[3]!="8.8.8.8"):
    timestamp,serial,domain,server = row
    k = (serial,domain)
    if k in inserted:
        pass  # already in result: skip
    else:
        result.append(row)
        inserted.add(k)

Result:

[['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'], ['1490019411.19', '150612899', 'google.com', '8.8.8.8'], ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8'], ['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'], ['1490026791.35', '150612898', 'google.com', '208.67.222.222'], ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4']]
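
Note that the sort moves the 8.8.8.8 rows to the front, so the result is not in the original order. If the input order matters, one possible variant (a sketch under assumptions, not the answerer's code) is to collect the (serial, domain) keys of the 8.8.8.8 rows first and then filter in place. The names `rows`, `REFERENCE` and `reference_keys` are hypothetical. Unlike the loop above, this variant does not deduplicate non-8.8.8.8 rows that share a key only with each other.

```python
# Hypothetical sample: the google.com rows and two intuit.com rows from the question.
rows = [
    ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
    ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
    ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
    ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
    ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
    ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8'],
]

REFERENCE = '8.8.8.8'  # assumed name for the reference server

# (serial, domain) pairs reported by the reference server.
reference_keys = {(serial, domain)
                  for _, serial, domain, server in rows
                  if server == REFERENCE}

# Keep reference rows, plus rows whose (serial, domain) pair is not a reference pair.
result = [row for row in rows
          if row[3] == REFERENCE or (row[1], row[2]) not in reference_keys]

for row in result:
    print(row)
```

Keying on the domain as well as the serial means a serial number that happens to be reused by another domain does not cause rows of that other domain to be dropped.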

Answer 3 (score: 1)

You can collect all the serial_numbers associated with the server (8.8.8.8), and then skip them with an if condition while building the list!

data=[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
      ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
      ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
      ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
      ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
      ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
      ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
      ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
      ['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
      ['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
      ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
      ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]


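serv='8.8.8.8'
# list(...) so that fil can be searched repeatedly on Python 3, where filter returns a one-shot iterator
fil=list(filter(None,map(lambda x: x[1] if x[3]==serv else None, data)))
print([i for i in data if i[1] not in fil or i[3] == serv])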

Output:

[['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'], ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'], ['1490026791.35', '150612898', 'google.com', '208.67.222.222'], ['1490019411.19', '150612899', 'google.com', '8.8.8.8'], ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'], ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]

If you time it, using a list comprehension (as a few of the other solutions do) takes:

7.9870223999e-05

Using lambda and map:

4.81605529785e-05

The difference shouldn't be a problem in this case, but timing does matter when the dataset is large. Hope it helps!
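
For anyone who wants to repeat such a measurement, here is a minimal sketch using the standard `timeit` module. It is not the benchmark that produced the numbers above. The synthetic `data`, the `example.com` domain and the function names `with_list` and `with_set` are all assumptions made for illustration. It compares storing the 8.8.8.8 serials in a list against storing them in a set, since the membership test is usually what dominates on large inputs.

```python
import timeit

# Hypothetical synthetic data: 1000 rows in groups of four, one 8.8.8.8 row per group.
servers = ['208.67.220.220', '208.67.222.222', '8.8.4.4', '8.8.8.8']
data = [[str(i),
         str(i // 4) if i % 4 else 'other-%d' % i,  # first row of each group gets its own serial
         'example.com',
         servers[i % 4]]
        for i in range(1000)]
serv = '8.8.8.8'

def with_list():
    # Serials kept in a list: every "not in" test scans the list.
    fil = [row[1] for row in data if row[3] == serv]
    return [row for row in data if row[1] not in fil or row[3] == serv]

def with_set():
    # Serials kept in a set: every "not in" test is a hash lookup.
    fil = {row[1] for row in data if row[3] == serv}
    return [row for row in data if row[1] not in fil or row[3] == serv]

# Both variants must select exactly the same rows.
assert with_list() == with_set()

print('list:', timeit.timeit(with_list, number=20))
print('set: ', timeit.timeit(with_set, number=20))
```

The absolute figures depend on the machine and the Python version, so only the relative difference between the two lines is meaningful.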

Answer 4 (score: 1)

Just build a list of the serials to filter on, then apply the filter in a list comprehension:

>>> l = [['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220'],
 ['1490026791.59', '2010113820', 'amazon.com', '208.67.222.222'],
 ['1490026791.57', '2010113820', 'amazon.com', '8.8.4.4'],
 ['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8'],
 ['1490026791.37', '150612899', 'google.com', '208.67.220.220'],
 ['1490026791.35', '150612898', 'google.com', '208.67.222.222'],
 ['1490026791.33', '150612899', 'google.com', '8.8.4.4'],
 ['1490019411.19', '150612899', 'google.com', '8.8.8.8'],
 ['1490026791.57', '2017032001', 'intuit.com', '208.67.220.220'],
 ['1490026791.47', '2017032001', 'intuit.com', '208.67.222.222'],
 ['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4'],
 ['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']]
>>>
>>> ip_check = '8.8.8.8'
>>> filter_serials = [lst[1] for lst in l if lst[3] == ip_check]
>>> filter_serials
['2010113820', '150612899', '2017032001']
>>> 
>>> output_list = [lst for lst in l if lst[3] == ip_check or lst[1] not in filter_serials]
>>> 
>>> for lst in output_list:
    print(lst)


['1490011456.91', '2010113819', 'amazon.com', '208.67.220.220']
['1490026791.55', '2010113820', 'amazon.com', '8.8.8.8']
['1490026791.35', '150612898', 'google.com', '208.67.222.222']
['1490019411.19', '150612899', 'google.com', '8.8.8.8']
['1490026791.45', '2017032000', 'intuit.com', '8.8.4.4']
['1490026791.43', '2017032001', 'intuit.com', '8.8.8.8']