Question

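I have a dictionary whose keys are IP address ranges (used for de-duplication in a previous step) and whose values are some objects. Here is an example.

Part of the dictionary sresult: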

10.102.152.64-10.102.152.95 object1:object3
10.102.158.0-10.102.158.255 object2:object5:object4
10.102.158.0-10.102.158.31  object3:object4
10.102.159.0-10.102.255.255 object6

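There are thousands of lines, and I want to sort them (correctly) by the IP address in the key.

I tried splitting the key on the range separator - to get a single IP address that can be sorted, like this: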

ips={}
for key in sresult:
    if '-' in key:
        l = key.split('-')[0]
        ips[l] = key
    else:
        ips[key] = key

Then, using code I found in another post, I sort by IP address and look up the values in the original dictionary:

sips = sorted(ipaddress.ip_address(line.strip()) for line in ips)
for x in sips:
    print("SRC: "+ips[str(x)], "OBJECT: "+" :".join(list(set(sresult[ips[str(x)]]))), sep=",")

The problem is that when I split the original range and use its first IP as the key of another dictionary, I de-duplicate the data a second time and lose entries that share a first IP, such as lines 2 and 3 in the example:

line 1 10.102.152.64 -10.102.152.95
line 2 10.102.158.0  -10.102.158.255
line 3 10.102.158.0  -10.102.158.31
line 4 10.102.159.0  -10.102.255.255

becomes

line 1 10.102.152.64 -10.102.152.95
line 3 10.102.158.0  -10.102.158.31
line 4 10.102.159.0  -10.102.255.255

So when I rebuild the original dictionary using the keys sorted by IP address, I have lost data.

Can anyone help?

Answer 1

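Edit: this post now has three parts:

1) Some information about dictionaries that you need in order to understand the rest.
2) An analysis of your code, and how to fix it without using any other Python features.
3) A detailed look at what I think is the best solution to this problem.

1) Dictionaries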

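Python dictionaries are not sorted by key. If I have a dictionary like this:

dictionary = {"one": 1, "two": 2}

and iterate over dictionary.items(), then on Python versions before 3.7 I might get "one": 1 first, or I might get "two": 2 first; I can't know. From Python 3.7 onward the items come back in insertion order, but that is still not the same as sorted order.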

Every Python dictionary implicitly has two lists associated with it: a list of its keys and a list of its values. You can list them like this:

print(list(dictionary.keys()))
print(list(dictionary.values()))

These lists do have an order, so they can be sorted. Of course, sorting them does not change the original dictionary.
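To make that concrete, here is a minimal sketch using the same toy dictionary, with the insertion order flipped so the effect is visible. It assumes Python 3.7 or later, where a dictionary keeps its insertion order.

```python
dictionary = {"two": 2, "one": 1}

# sorted() builds a new, ordered list from the keys.
sorted_keys = sorted(dictionary.keys())
print(sorted_keys)

# The dictionary itself is untouched and still in insertion order.
print(list(dictionary.keys()))
```

The first print shows the keys in alphabetical order, the second shows them in the order they were inserted.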

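2) Your code

You realised that, in your case, you only want to sort by the first IP address in the dictionary key. So the strategy you adopted is roughly:

1) Build a new dictionary whose keys are just that first part.
2) Get the list of keys from that dictionary.
3) Sort the list of keys.
4) Look up the values in the original dictionary.

As you noticed, this approach fails at step 1. As soon as you build a new dictionary from truncated keys, you can no longer tell apart keys that differ only in their last part, because every dictionary key must be unique.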
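The collision is easy to reproduce with the four sample keys from the question. This is only a sketch: storing the values as lists of strings is my assumption, since the question does not show how the objects are held.

```python
# Sample keys from the question; the list-of-strings values are an assumption.
sresult = {
    "10.102.152.64-10.102.152.95": ["object1", "object3"],
    "10.102.158.0-10.102.158.255": ["object2", "object5", "object4"],
    "10.102.158.0-10.102.158.31": ["object3", "object4"],
    "10.102.159.0-10.102.255.255": ["object6"],
}

# Key a second dictionary by the first address of each range.
ips = {}
for key in sresult:
    ips[key.split("-")[0]] = key

print(len(sresult), len(ips))
print(ips["10.102.158.0"])
```

Two of the ranges start at 10.102.158.0, so the second assignment silently overwrites the first and ips ends up with one entry fewer than sresult.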

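A better strategy is:

1) Build a function that represents your "full" IP address as an ip_address object.

2) Sort the list of dictionary keys (those of the original dictionary; don't build a new one).

3) Query the dictionary in that order.

Let's see how we can change your code to implement step 1.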

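import ipaddress

def represent(full_ip):
    # Stylistic note, never use o or l as variable names.
    # They look just like 0 and 1.
    # split('-') also handles a single address that has no range separator,
    # so the function never returns None.
    first_part = full_ip.split('-')[0]
    return ipaddress.ip_address(first_part.strip())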

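Now that we have a way to represent the full IP address, we can sort the keys by this shortened version without changing the keys at all. All we have to do is use the key parameter to tell Python's sorted function how we want the keys to be represented (note that this key parameter has nothing to do with the key in a dictionary; they just both happen to be called key):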

# Another stylistic note, always use .keys() when looping over dictionary keys. Explicit is better than implicit.

sips = sorted(sresult.keys(), key=represent)

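As long as the ipaddress library does its job, there should be no problems up to this point. The rest of your code works almost as it is: sips now holds the full keys, so each value can be looked up directly with sresult[key], and the intermediate ips dictionary is no longer needed.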
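Putting the pieces of this part together, a self-contained sketch could look like the following. It uses the sample keys from the question in a deliberately shuffled order; the list-of-strings values are an assumption on my part.

```python
import ipaddress

def represent(full_ip):
    """First address of a range such as 'a-b', or the address itself."""
    return ipaddress.ip_address(full_ip.split("-")[0].strip())

# Sample data from the question, deliberately out of order.
# Storing the values as lists of strings is an assumption.
sresult = {
    "10.102.159.0-10.102.255.255": ["object6"],
    "10.102.158.0-10.102.158.255": ["object2", "object5", "object4"],
    "10.102.152.64-10.102.152.95": ["object1", "object3"],
    "10.102.158.0-10.102.158.31": ["object3", "object4"],
}

# Sort the original keys; nothing is truncated, so nothing is lost.
sips = sorted(sresult.keys(), key=represent)
for key in sips:
    print("SRC: " + key, "OBJECT: " + ":".join(sresult[key]), sep=",")
```

All four ranges survive. Because sorted is stable, the two ranges that start at 10.102.158.0 keep the relative order they had in the dictionary.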

3) The best solution

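Whenever you are dealing with sorting, it is easiest to think about a much simpler problem: given two items, how would I compare them? Python gives us a way to do this. What we have to do is implement the data model methods __lt__, __le__ and __eq__. Note that sorted and list.sort compare items with "<", so __lt__ is the one that sorting actually relies on.

Let's try it:

import ipaddress

class IPAddress:
    def __init__(self, ip_address):
        self.ip_address = ip_address  # This will be the full IP address

    def first_ip(self):
        """Return the first address of the range as an ipaddress object."""
        # Find the first part of the IP range and put it into the external library
        return ipaddress.ip_address(self.ip_address.split("-")[0].strip())

    def __lt__(self, other):
        """Is this object strictly less than the other one?"""
        return self.first_ip() < other.first_ip()

    def __le__(self, other):
        """Is this object less than or equal to the other one?"""
        return self.first_ip() <= other.first_ip()

    def __eq__(self, other):
        """Are the two objects equal?"""
        return self.ip_address == other.ip_address

Cool, we have a class. Now, whenever I use "<", "<=" or "==", these data model methods are called automatically. Let's check that it works: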

test_ip_1 = IPAddress("10.102.152.64-10.102.152.95")
test_ip_2 = IPAddress("10.102.158.0-10.102.158.255")

print(test_ip_1 <= test_ip_2)

Now, the nice thing about these data model methods is that Python's sort and sorted will use them too:

dictionary_keys = sresult.keys()
dictionary_key_objects = [IPAddress(key) for key in dictionary_keys]
sorted_dictionary_key_objects = sorted(dictionary_key_objects)
# According to your latest comment, the line below is what you are missing
sorted_dictionary_keys = [key_object.ip_address for key_object in sorted_dictionary_key_objects]

Now you can do this:

for key in sorted_dictionary_keys:
    print(key)
    print(sresult[key])

The Python data model is pretty much the defining feature of Python. I recommend reading up on it.
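As a further sketch of the data model approach, functools.total_ordering can derive the remaining comparison operators from just __eq__ and __lt__. The variant below is my own variation rather than anything from the question: the class name IPRange and its attributes are hypothetical, and it compares the pair (first address, last address) so that ranges sharing a first address are ordered by where they end.

```python
import functools
import ipaddress

@functools.total_ordering
class IPRange:
    """Hypothetical wrapper for a key such as '10.0.0.0-10.0.0.255'."""

    def __init__(self, text):
        self.text = text  # the original dictionary key, kept unchanged
        first, _, last = text.partition("-")
        self.first = ipaddress.ip_address(first.strip())
        # Assumption: a single address counts as a range of length one.
        self.last = ipaddress.ip_address(last.strip()) if last else self.first

    def _sort_key(self):
        return (self.first, self.last)

    def __eq__(self, other):
        if not isinstance(other, IPRange):
            return NotImplemented
        return self._sort_key() == other._sort_key()

    def __lt__(self, other):
        if not isinstance(other, IPRange):
            return NotImplemented
        return self._sort_key() < other._sort_key()

keys = [
    "10.102.158.0-10.102.158.255",
    "10.102.159.0-10.102.255.255",
    "10.102.158.0-10.102.158.31",
    "10.102.152.64-10.102.152.95",
]
ordered = [ip_range.text for ip_range in sorted(IPRange(k) for k in keys)]
for key in ordered:
    print(key)
```

Comparing on both ends of the range keeps __eq__ and __lt__ consistent with each other, which is what total_ordering expects. If the order of ranges with the same start does not matter to you, the simpler key function from part 2 is enough.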

Python: sorting IP ranges that are dictionary keys
