Question

I'm a Python newbie trying to parse a file to build a table of memory allocations. My input file has the following format:

48 bytes allocated at 0x8bb970a0
24 bytes allocated at 0x8bb950c0
48 bytes allocated at 0x958bd0e0
48 bytes allocated at 0x8bb9b060
96 bytes allocated at 0x8bb9afe0
24 bytes allocated at 0x8bb9af60

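My first goal is to build a table that counts how many times each allocation size occurs. In other words, my desired output for the input above would be: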

48 bytes -> 3 times
96 bytes -> 1 times
24 bytes -> 2 times

(For now, I don't care about the memory addresses.)

Since I'm using Python, I figured a dictionary would be the right way to do this (based on about 3 hours of reading Python tutorials). Is that a good idea?

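In trying to use a dictionary, I decided to make the number of bytes the 'key' and a counter the 'value'. My plan is to increment the counter each time the key occurs. As of now, my code snippet is as follows: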

# Create an empty dictionary
allocationList = {}

# Open file for reading
with open("allocFile.txt") as fp: 
    for line in fp: 
        # Split the line into a list (using space as delimiter)
        lineList = line.split(" ")

        # Extract the number of bytes
        numBytes = lineList[0];

        # Store in a dictionary
        if allocationList.has_key('numBytes')
            currentCount = allocationList['numBytes']
            currentCount += 1
            allocationList['numBytes'] = currentCount
        else
            allocationList['numBytes'] = 1 

for bytes, count in allocationList.iteritems()
    print bytes, "bytes -> ", count, " times"

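With this, I get a syntax error on the 'has_key' call, which makes me question whether variables can even be used as dictionary keys. All the examples I have seen so far assume the keys are known in advance. In my case, I only learn the keys as I parse the input file.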

(Note that my input file can run to thousands of lines, with hundreds of different keys.)

Thanks for any help you can provide.

Answer 1

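Learning a language is as much about its standard library as about its syntax and basic types. Python already has a class that makes your task very easy: collections.Counter.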

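from collections import Counter

with open("allocFile.txt") as fp:
    # The first whitespace-separated field on each line is the byte count
    counter = Counter(line.split()[0] for line in fp)

# most_common() yields (key, count) pairs, highest count first
for num_bytes, count in counter.most_common():
    print("%s bytes -> %d times" % (num_bytes, count))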
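
As a quick check of the Counter approach, here is a minimal sketch that runs on the six sample lines from the question instead of reading allocFile.txt. It assumes Python 3, and the names sample_lines and size_counts are made up for illustration.

```python
from collections import Counter

# The six sample lines from the question, held in memory
# (an illustrative stand-in for the real input file).
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
    "48 bytes allocated at 0x8bb9b060",
    "96 bytes allocated at 0x8bb9afe0",
    "24 bytes allocated at 0x8bb9af60",
]

# The first whitespace-separated field of each line is the allocation size.
size_counts = Counter(line.split()[0] for line in sample_lines)

# most_common() yields (key, count) pairs, highest count first.
for size, count in size_counts.most_common():
    print(size, "bytes ->", count, "times")
```

Note that the keys here are strings such as "48". If the sizes need to be sorted or compared numerically, wrapping the field in int() before counting is one option.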

Answer 2

The dict.has_key() method was removed in Python 3. To replace it, use the in keyword:

if numBytes in allocationList:    # do not use numBytes as a string, use the variable directly
    #do the stuff

But in your case, you can also replace all of

if allocationList.has_key('numBytes')
    currentCount = allocationList['numBytes']
    currentCount += 1
    allocationList['numBytes'] = currentCount
else
    allocationList['numBytes'] = 1

with a single line using get:

allocationList[numBytes] = allocationList.get(numBytes, 0) + 1
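
To show how that one-liner fits into the whole loop, here is a small sketch assuming Python 3. The function name count_sizes and the in-memory sample_lines list are illustrative choices, not part of the original code.

```python
def count_sizes(lines):
    """Count how often each allocation size (the first field) appears."""
    counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        num_bytes = fields[0]
        # get() returns 0 when the key is not present yet,
        # so no separate if/else branch is needed.
        counts[num_bytes] = counts.get(num_bytes, 0) + 1
    return counts


# Illustrative input; in the question this would come from the file.
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
]

allocation_counts = count_sizes(sample_lines)
for num_bytes, count in allocation_counts.items():
    print(num_bytes, "bytes ->", count, "times")
```

Because a file object is iterable line by line, the same function should also accept the result of open("allocFile.txt") directly.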

Answer 3

You are getting a syntax error because you are missing a colon at the end of this line:

if allocationList.has_key('numBytes')
                                     ^

Your approach is fine, but it may be easier to use dict.get() with a default value:

allocationList[numBytes] = allocationList.get(numBytes, 0) + 1

Since your allocationList is a dictionary and not a list, you may want to choose a different name for the variable.
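
For reference, this is roughly what the question's loop might look like with only the minimal fixes applied: the missing colons added, the variable used as the key instead of a quoted string, and the dictionary renamed. It is a sketch that assumes Python 3, and it uses io.StringIO with a few sample lines as a stand-in for the real file so that it runs on its own.

```python
import io

# Stand-in for open("allocFile.txt"): a file-like object over a string.
sample_file = io.StringIO(
    "48 bytes allocated at 0x8bb970a0\n"
    "24 bytes allocated at 0x8bb950c0\n"
    "48 bytes allocated at 0x958bd0e0\n"
)

allocation_counts = {}  # renamed from allocationList, since it is a dict

for line in sample_file:
    # Extract the number of bytes (first space-separated field)
    num_bytes = line.split(" ")[0]

    if num_bytes in allocation_counts:   # colon added; variable, not 'numBytes'
        allocation_counts[num_bytes] += 1
    else:                                # colon added
        allocation_counts[num_bytes] = 1

# items() is the Python 3 replacement for iteritems()
for num_bytes, count in allocation_counts.items():
    print(num_bytes, "bytes ->", count, "times")
```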

Answer 4

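You can definitely use variables as dict keys. However, you have a variable called numBytes but are using a string containing the text "numBytes": that is a string constant, not the variable. This won't cause an error, but it is a problem, because every line ends up updating the same key. Instead, try: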

if numBytes in allocationList:
    # do stuff
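
The difference between the quoted string and the variable can be seen in a short experiment. This is only an illustration, assuming Python 3; the names sizes, literal_counts and variable_counts are hypothetical.

```python
# Sizes as they would be extracted from three input lines.
sizes = ["48", "24", "48"]

# Quoted key: 'numBytes' is a fixed string, so every iteration
# updates the same single entry regardless of the value read.
literal_counts = {}
for numBytes in sizes:
    literal_counts['numBytes'] = literal_counts.get('numBytes', 0) + 1

# Variable key: the current value of numBytes is used as the key,
# so each distinct size gets its own counter.
variable_counts = {}
for numBytes in sizes:
    variable_counts[numBytes] = variable_counts.get(numBytes, 0) + 1

print(literal_counts)   # {'numBytes': 3}
print(variable_counts)  # {'48': 2, '24': 1}
```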

Also, consider Counter. It is a handy class for exactly the kind of case you are looking at.
