Question

I have a list of APIs and their corresponding execution times, in the following format:

findByNameWithProduct, 108
findProductByPartNumber, 140
findProductById, 178
findByName, 99
findProductsByCategory, 260
findByNameWithCategory, 103
findByNameWithCategory, 108
findByNameWithCategory, 99
findByNameWithProduct, 20
findProductById, 134
findTopCategories, 54
findByName, 48
findSubCategories, 44
findProductByPartNumber, 70
findProductByPartNumber, 63

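What I want to do is, for each unique API, store the minimum, maximum, average, and 90th-percentile execution time, but I'm not sure how to go about it. I've considered using a dictionary, which would let me detect whether an API has already been entered, but as far as I know a dictionary is just a name-value pair, not multiple entries. I've been playing around with something like the code below, but I know it isn't efficient (plus it doesn't work). I'm not very familiar with data structures in Python. Does anyone know a clean way to accomplish this?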

counter = 0
uniqueAPINames = set(apiNames)
for uniqueAPIName in uniqueAPINames :
    for line in lines:
        if uniqueAPIName in line:
            print(line)
            # Somehow add all these up...
    counter = counter + 1

Edit:

With the help of the accepted answer, here is the solution:

import os

# folder (the log directory) and apiData (test number -> API name)
# are assumed to be defined earlier in the script.
tests = []
for f in os.listdir(folder):
    if '-data.log' in f:
        # Read each file once, so earlier files are not processed again.
        with open(os.path.join(folder, f), 'r') as fi:
            for line in fi:
                if 'Thread' not in line:
                    lineSplit = line.strip().split(',')
                    testNumber = lineSplit[2].strip()
                    testName = apiData[testNumber]
                    testTime = lineSplit[4].strip()
                    tests.append([testName, testTime])

d = {}
for test in tests:
    if test[0] in d:
        d[test[0]].append(test[1])
    else:
        d[test[0]] = [test[1]]

for key in d:
    print('API Name: ' + str(key))
    d[key] = sorted(int(i) for i in d[key])  # convert to int, then sort ascending
    n = len(d[key])
    print('Min: ' + str(d[key][0]))
    print('Max: ' + str(d[key][-1]))
    print('Average: ' + str(float(sum(d[key])) / n))
    # Nearest-rank 90th percentile: the value at 1-based rank ceil(0.9 * n).
    rank = (9 * n + 9) // 10
    print('90th Percentile: ' + str(d[key][rank - 1]))

Answer 1

You're on the right track with a dictionary. A value can be anything, and in this case a list makes sense:

d = {}
for api_name, runtime in whatever:
    if api_name in d:  # we've seen it before
        d[api_name].append(runtime)
    else:  # first time
        d[api_name] = [runtime]  # list with one entry
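
As a variation on the loop above, collections.defaultdict(list) creates the empty list on first access, so the membership test is not needed. This is only a minimal sketch: it assumes the input is text lines in the "name, time" format shown in the question, and sample_lines and group_runtimes are names made up for illustration.

```python
from collections import defaultdict

# A few lines in the question's "name, time" format (sample data only).
sample_lines = [
    "findByName, 99",
    "findProductById, 178",
    "findByName, 48",
    "findProductById, 134",
]


def group_runtimes(lines):
    """Map each API name to the list of its runtimes, in input order."""
    grouped = defaultdict(list)  # a missing key starts out as an empty list
    for line in lines:
        name, runtime = line.split(",")
        grouped[name.strip()].append(int(runtime))
    return grouped


d = group_runtimes(sample_lines)
print(dict(d))
```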

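Now you have a dict that maps each API name to a list of all its runtimes. Is the rest clear? I would sort each list; after that, finding the min, max, and percentiles is easy.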

for runtimes in d.values():  # itervalues() exists only in Python 2
    runtimes.sort()  # sorts each list in place

That sorts every runtime list in the dict.
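
Once the lists are sorted, the four statistics can be read off roughly as follows. This is a sketch rather than the only way to do it: summarize is a hypothetical helper, and it assumes the nearest-rank definition of a percentile (the value at 1-based rank ceil(0.9 * n)); other definitions interpolate between neighbouring values and can give slightly different results.

```python
def summarize(runtimes):
    """Return (min, max, mean, 90th percentile) for a non-empty list of numbers."""
    ordered = sorted(runtimes)
    n = len(ordered)
    # Nearest-rank percentile: 1-based rank ceil(0.9 * n), in integer arithmetic.
    rank = (9 * n + 9) // 10
    mean = float(sum(ordered)) / n
    return ordered[0], ordered[-1], mean, ordered[rank - 1]


# Runtimes taken from the question's sample data.
d = {
    "findProductByPartNumber": [140, 70, 63],
    "findByNameWithCategory": [103, 108, 99],
}

for api_name in sorted(d):
    low, high, mean, p90 = summarize(d[api_name])
    print("%s: min=%d max=%d avg=%.1f p90=%d" % (api_name, low, high, mean, p90))
```

With only a handful of samples per API, the nearest-rank 90th percentile is simply the largest value; it becomes more informative as the lists grow.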

Generating a set of API timings from a CSV
