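I'm trying to write a very simple counting script and I'd like to use defaultdict (I can't work out how defaultdict is meant to be used, so if someone could comment my code snippet I'd really appreciate it).
My goal is to take element 0 and element 1 of each row, merge them into a single string, and then count how many unique strings there are.
For example, the data below has 15 rows made up of 3 classes and 4 class IDs; once the two columns are merged there are only 4 unique classes. The merged value for the first row (ignoring the header row) is: Class01CD2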
uniq1,uniq2,three,four,five,six
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
DClass2,DE2,data,data,data,data
DClass2,DE2,data,data,data,data
Class02,CD1,data,data,data,data
Class02,CD1,data,data,data,data
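The idea is simply to print the number of unique classes available. Can anyone help me with this?
Regards
- Hyflex
Answer 0 (score: 1)
Since you're working with CSV data, you can use the csv module together with a dictionary: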
import csv

uniq = {}  # Empty dictionary mapping each joined string to its number of occurrences.
with open('data.csv', 'r') as ifile:  # Whatever your CSV file is named.
    reader = csv.reader(ifile)
    for row in reader:
        joined = row[0] + row[1]  # The joined string is simply the first and second columns of the row.
        # If the key already exists, increment its count by 1.
        if joined in uniq:
            uniq[joined] += 1
        else:
            uniq[joined] = 1  # The key doesn't exist yet, so add it with a count of 1.
print(uniq)  # Now output the results.
Output:
{'Class02CD3': 7, 'Class02CD1': 2, 'Class01CD2': 4, 'DClass2DE2': 2}
Note: this assumes the CSV has no header row (uniq1,uniq2,three,four,five,six).
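Since the question asked about defaultdict specifically, here is a minimal sketch of the same count using collections.defaultdict. It assumes Python 3, and it reads the sample rows from an in-memory string so it can be run as-is. The names SAMPLE, count_classes and has_header are made up for this illustration and do not come from the question.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question, header row included.
SAMPLE = """uniq1,uniq2,three,four,five,six
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
DClass2,DE2,data,data,data,data
DClass2,DE2,data,data,data,data
Class02,CD1,data,data,data,data
Class02,CD1,data,data,data,data
"""


def count_classes(lines, has_header=True):
    """Count how often each joined (column 0 + column 1) string occurs."""
    # defaultdict(int) creates a missing key with the value int() == 0,
    # so no "does the key exist?" check is needed before incrementing.
    counts = defaultdict(int)
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row, if there is one
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or too-short lines
        counts[row[0] + row[1]] += 1
    return counts


counts = count_classes(io.StringIO(SAMPLE))
print(dict(counts))   # occurrences per joined string
print(len(counts))    # number of unique classes
```

To read from a real file instead, pass an open file object (for example the result of open('data.csv', newline='')) in place of the io.StringIO wrapper. The number of unique classes is just the number of keys, which is why len(counts) is printed at the end.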