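Filtering duplicated lines does not work

Time: 2017-10-04 10:39:20

Tags: python python-3.x

I have tried several ways to remove duplicates from the final output, but the result is the same every time, and I don't understand why. My latest attempt is the example below: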

#!/usr/bin/python3

import re
import sys
import json

## Variables
fileOfHosts = "/path/to/file"

## Define lists and dics
my_dic = {}
resultOut = []

## Functions:

# Filter duplicated lines in list
def f1(seq):
    newlist = []
    for i in seq:
        if i not in newlist:
            newlist.append(i)
    return newlist

# Json converter
def f2(inout):
    out = json.dumps(inout)
    return out

## Open file as list of lines
with open(fileOfHosts) as hostsFile:
    result = list(hostsFile)

## Parse lines and generate dictionary
for f in f1(result):
    sortByWord = re.findall(r"[\w+\.']+", f)
    listOfTwo = sortByWord[:2]
    if len(listOfTwo) == 2:
        my_dic[listOfTwo[0]] = listOfTwo[1]
        resultOut.append(my_dic.copy())

## Display list of dictionaries as json
print(f2(resultOut))

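I also tried filtering the list of dictionaries at the end, but I always get the same duplicated lines.

Can someone suggest a better way to filter out the duplicates?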
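
For reference, filtering a list of dictionaries directly could be sketched as below. Dictionaries are not hashable, so this version compares a canonical JSON string of each one. The helper name `dedupe_dicts` is made up for illustration, and the sketch assumes all values are JSON-serializable.

```python
import json

def dedupe_dicts(dicts):
    """Return the dicts in their original order, dropping exact repeats."""
    seen = set()
    unique = []
    for d in dicts:
        # sort_keys makes the string independent of key insertion order
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

# Made-up sample data: the first two entries are equal as dictionaries.
data = [{"a": "1", "b": "2"}, {"b": "2", "a": "1"}, {"a": "3"}]
print(dedupe_dicts(data))
```

This only removes entries that are equal as whole dictionaries; it does not help if the entries differ because they were built incorrectly in the first place.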

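Edit

First of all, this is not a duplicate of the question mentioned in the comments. I tried the solutions suggested there before posting.

In fact, the problem does not seem to lie in the deduplication method; the duplicates are introduced when the dictionaries are created.

Code: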

for f in result:
    sortByWord = re.findall(r"[\w+\.']+", f)
    # print(sortByWord)
    listOfTwo = sortByWord[:2]
    # print(listOfTwo)
    if len(listOfTwo) == 2:
        print(listOfTwo)
        my_dic[listOfTwo[0]] = listOfTwo[1]
        resultOut.append(my_dic)

Output (print(listOfTwo)):

['define', 'host']
['host_name', 'HOST_name']
['alias', 'HOST_name']
['address', '127.0.0.1']
['register', '1']
['timezone', 'Europe']
['use', 'user']
['_SNMPCOMMUNITY', 'public']
['_SNMPVERSION', '3']
['_HOST_ID', '184']
['define', 'host']
['host_name', 'HOST_name']
['alias', 'HOST_name']
['address', '127.0.0.1']
['register', '1']
['timezone', 'Europe']
['use', 'user']
['_SNMPCOMMUNITY', 'public']
['_SNMPVERSION', '3']
['_HOST_ID', '185']

Output (print(f2(resultOut))):

[{"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": 
"host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", "host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}, {"_HOST_ID": "185", "address": "127.0.0.1", "_SNMPCOMMUNITY": "public", "timezone": "Europe", "define": "host", 
"host_name": "host", "_SNMPVERSION": "3", "alias": "host", "register": "1", "use": "user"}]

I don't understand why.
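
A likely explanation, sketched with made-up data rather than the real hosts file: `resultOut.append(my_dic)` appends a reference to the same dictionary on every iteration, so every later assignment shows up in every list entry. With `my_dic.copy()` the entries are independent, but each one is a cumulative snapshot rather than one dictionary per host. The `pairs` list below is an assumption standing in for the parsed lines.

```python
# Hypothetical key/value pairs standing in for the parsed lines of two hosts.
pairs = [("host_name", "a"), ("_HOST_ID", "184"),
         ("host_name", "b"), ("_HOST_ID", "185")]

shared = {}
aliased = []      # filled with references to one and the same dict
snapshots = []    # filled with copies taken after each assignment

for key, value in pairs:
    shared[key] = value
    aliased.append(shared)
    snapshots.append(shared.copy())

# Every element of `aliased` is the same object, so all show the final state.
print(all(d is shared for d in aliased))
print(aliased[0] == aliased[-1])

# The copies differ, but each is a running snapshot, not a per-host record.
print(snapshots[0])
print(snapshots[-1])
```

Under this reading, the list needs a fresh dictionary for each host block, not a deduplication step afterwards.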

1 answer:

Answer 0 (score: 0)

I solved my problem with the following code:

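# The is_empty() helper is not shown in the post; it is assumed to test for
# an empty container, so plain truthiness checks are used here instead.
list1 = []  # keys of the current block
list2 = []  # values of the current block
for f in result:
    sortByWord = re.findall(r"[\w+\.']+", f)
    listOfTwo = sortByWord[:2]
    if len(listOfTwo) == 2:
        list1.append(listOfTwo[0])
        list2.append(listOfTwo[1])
    if not listOfTwo:
        # A line without any words ends the current block
        my_dic = dict(zip(list1, list2))
        if my_dic:
            resultOut.append(my_dic)
        list1 = []
        list2 = []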
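
The same idea can be sketched as a self-contained function that builds one fresh dictionary per block. The function name `parse_blocks`, the sample text, and the rule that a block ends at a line without any words (a blank line or a lone closing brace) are assumptions for illustration, not taken from the real file.

```python
import json
import re

def parse_blocks(lines):
    """Group key/value lines into one dict per block.

    Assumes a block ends at a line that contains no words at all,
    such as a blank line or a lone closing brace.
    """
    blocks = []
    current = {}
    for line in lines:
        words = re.findall(r"[\w+\.']+", line)
        if len(words) >= 2:
            current[words[0]] = words[1]
        elif not words and current:
            blocks.append(current)
            current = {}  # start a fresh dict instead of reusing the old one
    if current:  # flush the last block if no terminator line follows it
        blocks.append(current)
    return blocks

# Made-up sample input; the real file layout is an assumption.
sample = """\
define host{
    host_name       HOST_a
    address         127.0.0.1
    _HOST_ID        184
}
define host{
    host_name       HOST_b
    address         127.0.0.1
    _HOST_ID        185
}
"""

print(json.dumps(parse_blocks(sample.splitlines())))
```

Rebinding `current` to a new dictionary after each block keeps the list entries independent, and the final flush covers input that does not end with a terminator line.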