I have a result class like the following:
class Result:
    cluster = -1
    label = -1
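Both cluster and label can take values from 0 to 9. What I want is to count how many times each label occurs within a cluster. At the moment I am counting with the code below, but it is not a good solution. resultList is a list of Result objects.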
countZero = 0
countOne = 0
countTwo = 0
countThree = 0
countFour = 0
countFive = 0
countSix = 0
countSeven = 0
countEight = 0
countNine = 0
for i in range(len(resultList)):
    if resultList[i].cluster == 0:
        if resultList[i].label == 0:
            countZero = countZero + 1
        if resultList[i].label == 1:
            countOne = countOne + 1
        if resultList[i].label == 2:
            countTwo = countTwo + 1
        if resultList[i].label == 3:
            countThree = countThree + 1
        if resultList[i].label == 4:
            countFour = countFour + 1
        if resultList[i].label == 5:
            countFive = countFive + 1
        if resultList[i].label == 6:
            countSix = countSix + 1
        if resultList[i].label == 7:
            countSeven = countSeven + 1
        if resultList[i].label == 8:
            countEight = countEight + 1
        if resultList[i].label == 9:
            countNine = countNine + 1
print(countZero)
print(countOne)
print(countTwo)
print(countThree)
print(countFour)
print(countFive)
print(countSix)
print(countSeven)
print(countEight)
print(countNine)
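Any suggestions or guidance toward a better solution would be much appreciated.
Answer 0 (score: 2)
Counter returns a dictionary-like object holding the count of each label. Use it like this for cluster 0: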
from collections import Counter
# resultList is a plain list of Result objects, so filter it with a generator expression
Counter(r.label for r in resultList if r.cluster == 0)
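For reference, here is a minimal self-contained sketch of the Counter approach. It assumes resultList is a plain Python list of Result objects; the constructor and the sample data are made up for illustration.

```python
from collections import Counter


class Result:
    def __init__(self, cluster, label):
        self.cluster = cluster
        self.label = label


# Hypothetical sample data, for illustration only
resultList = [Result(0, 1), Result(0, 1), Result(0, 3), Result(1, 1)]

# Count the labels of the results that belong to cluster 0
cluster0_counts = Counter(r.label for r in resultList if r.cluster == 0)

print(cluster0_counts[1])  # 2
print(cluster0_counts[3])  # 1
print(cluster0_counts[7])  # 0, a Counter returns 0 for keys it has not seen
```

Because a Counter returns 0 for missing keys, there is no need to pre-fill entries for labels that never occur.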
Answer 1 (score: 1)
This is what data structures are for. Here you can use a dict to do it all in a few lines:
counts = {i: 0 for i in range(10)}  # builds a dict mapping each number 0-9 to a count of 0
for i in range(len(resultList)):
    if resultList[i].cluster == 0:
        counts[resultList[i].label] += 1  # find the count corresponding to the label, and increment it
for k, v in counts.items():  # .items() yields (key, value) pairs
    print(f"Count {k}: {v}")
Answer 2 (score: 1)
counts = [0 for x in range(10)]
for i in range(len(resultList)):
    if resultList[i].cluster == 0:
        counts[resultList[i].label] += 1
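The same list idea can probably be stretched to cover every cluster at once with a 10x10 table indexed as counts[cluster][label]. This is only a sketch, assuming clusters and labels are always integers from 0 to 9; the constructor and the sample data are hypothetical.

```python
class Result:
    def __init__(self, cluster, label):
        self.cluster = cluster
        self.label = label


# Hypothetical sample data, for illustration only
resultList = [Result(0, 1), Result(0, 1), Result(2, 9), Result(2, 1)]

# One row per cluster, one column per label.
# A comprehension is used so that every row is a distinct list.
counts = [[0] * 10 for _ in range(10)]
for r in resultList:
    counts[r.cluster][r.label] += 1

print(counts[0][1])    # 2
print(counts[2][9])    # 1
print(sum(counts[2]))  # 2, total number of results in cluster 2
```

Note that writing [[0] * 10] * 10 instead would make all ten rows refer to the same inner list, so an increment in one cluster would show up in every cluster.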
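Answer 3 (score: 1)
If you want the count of a specific label within a specific cluster, you can create a nested dictionary keyed by cluster_id and then label_id: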
# Create empty dictionary
cluster_dict = {}
# For 0-9 cluster_id
for cluster_id in range(10):
    # Create a dict for each cluster
    if cluster_id not in cluster_dict.keys():
        cluster_dict[cluster_id] = {}
    # For 0-9 label_id
    for label_id in range(10):
        # Set the cluster/label count to 0
        cluster_dict[cluster_id][label_id] = 0
You can then fill it with the values from result_list:
for res in result_list:
    cluster_dict[res.cluster][res.label] += 1
This gives you direct access to the counts, for example for cluster 0 and label 2:
cluster_dict[0][2]
You can also find the number of results in a given cluster, regardless of label:
sum(cluster_dict[0].values())
You can also find the number of results with a given label, regardless of cluster:
sum([count for cluster_id, label_counter in cluster_dict.items() for label_id, count in label_counter.items() if label_id == 2])
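As a possible variation, the nested dictionary can be built without the initialisation loops by combining collections.defaultdict with collections.Counter. This is a sketch rather than part of the answer above; the Result constructor and the sample data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class Result:
    def __init__(self, cluster, label):
        self.cluster = cluster
        self.label = label


# Hypothetical sample data, for illustration only
result_list = [Result(0, 2), Result(0, 2), Result(1, 2), Result(1, 5)]

# Each cluster gets its own Counter the first time it is seen
cluster_dict = defaultdict(Counter)
for res in result_list:
    cluster_dict[res.cluster][res.label] += 1

# Count for cluster 0 and label 2
count_0_2 = cluster_dict[0][2]
# Number of results in cluster 0, regardless of label
total_cluster_0 = sum(cluster_dict[0].values())
# Number of results with label 2, regardless of cluster
total_label_2 = sum(counter[2] for counter in cluster_dict.values())

print(count_0_2, total_cluster_0, total_label_2)  # 2 2 3
```

The trade-off is that clusters and labels that never occur are simply absent instead of being stored as explicit zeros, although looking them up still yields 0.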
Answer 4 (score: 1)
A simpler approach:
import random

class Result:
    def __init__(self, cluster, label):
        self.label = label
        self.cluster = cluster

# global counter, one entry per label 0-9 as described in the question
counter = {key: 0 for key in range(10)}
# generate random results for testing
lists = [Result(random.randint(0, 1), random.randint(0, 9)) for r in range(1000)]
for result in lists:
    # only results in cluster 0 contribute to the count
    counter[result.label] += 1 if result.cluster == 0 else 0
Answer 5 (score: 0)
Similar to the other answers, but using defaultdict. Essentially a one-liner.
from collections import defaultdict

class Result:
    def __init__(self, c, l):
        self.cluster = c
        self.label = l

counts = defaultdict(int)
resultList = [Result(1,9), Result(2,1), Result(3, 1), Result(1, 2), Result(1,9)]
for r in resultList:
    counts[(r.cluster, r.label)] += 1
print(counts)
Output:
defaultdict(<class 'int'>, {(1, 9): 2, (2, 1): 1, (3, 1): 1, (1, 2): 1})
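To get back to the per-cluster view asked for in the question, the tuple-keyed counts can be filtered by cluster. A small sketch, reusing the output shown above as input; the variable name cluster_1 is just illustrative.

```python
# Counts keyed by (cluster, label), copied from the output above
counts = {(1, 9): 2, (2, 1): 1, (3, 1): 1, (1, 2): 1}

# Keep only the entries for cluster 1, keyed by label
cluster_1 = {label: n for (cluster, label), n in counts.items() if cluster == 1}

print(cluster_1)  # {9: 2, 2: 1}
```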