Creating a nested dictionary or collections Counter with pandas and Python

Time: 2018-04-23 16:13:55

Tags: python pandas dictionary counter

I want to create a nested dictionary or collection in Python by grouping the values of one list by the values of another:

seriesA = ["groupA", "groupA", "groupB", "groupB", "groupC"]
seriesB = ["item1", "item1", "item3", "item1", "item2"]

Desired output:

{ 'groupA': {'item1': 2},
  'groupB': {'item3': 1}, {'item1':1},
  'groupC': {'item2': 1}}

Is there a simpler way to do this in Python, or should I iterate over the list of tuples and add to a collections Counter?

nested_dict["groupA"]["item1"] 

...should return 2.

3 answers:

Answer 0 (score: 4)

I would use collections.defaultdict and collections.Counter:

from collections import defaultdict, Counter
from pprint import pprint

seriesA = ["groupA", "groupA", "groupB", "groupB", "groupC"]
seriesB = ["item1", "item1", "item3", "item1", "item2"]

nested_dict = defaultdict(Counter)

for a,b in zip(seriesA, seriesB):
    nested_dict[a][b] += 1

assert nested_dict["groupA"]["item1"] == 2
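
The question also asks about iterating over tuples with a Counter. A minimal sketch of that variant, assuming a flat mapping keyed by (group, item) pairs is acceptable as an intermediate step; the names pair_counts and nested are only illustrative:

```python
from collections import Counter

seriesA = ["groupA", "groupA", "groupB", "groupB", "groupC"]
seriesB = ["item1", "item1", "item3", "item1", "item2"]

# Count each (group, item) tuple directly.
pair_counts = Counter(zip(seriesA, seriesB))

# Fold the flat tuple counts into a plain nested dict.
nested = {}
for (group, item), n in pair_counts.items():
    nested.setdefault(group, {})[item] = n
```

Here pair_counts[("groupA", "item1")] and nested["groupA"]["item1"] hold the same count. The defaultdict(Counter) version above avoids the second pass, so this is mainly useful when the flat tuple counts are wanted anyway.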

Answer 1 (score: 2)

I think the format should be:

{key:[{ke1:va1,key2:val2}]}
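
One possible reading of this format, sketched under the assumption that each group maps to a one-element list holding its count dictionary (the answer does not say how the list should be populated, and list_format is a made-up name):

```python
seriesA = ["groupA", "groupA", "groupB", "groupB", "groupC"]
seriesB = ["item1", "item1", "item3", "item1", "item2"]

# First build plain per-group counts.
counts = {}
for group, item in zip(seriesA, seriesB):
    inner = counts.setdefault(group, {})
    inner[item] = inner.get(item, 0) + 1

# Wrap each count dict in a list to get {key: [{key1: val1, key2: val2}]}.
list_format = {group: [inner] for group, inner in counts.items()}
```

With this layout a lookup needs an extra index, for example list_format["groupA"][0]["item1"], so it does not satisfy the nested_dict["groupA"]["item1"] access pattern from the question.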

Answer 2 (score: 2)

The desired output you specified is not a valid dictionary. However, you can get a similar, valid result with itertools.groupby and collections.Counter:

seriesA = ["groupA", "groupA", "groupB", "groupB", "groupC"]
seriesB = ["item1", "item1", "item3", "item1", "item2"]

from itertools import groupby
from collections import Counter

myCounts = {k: Counter(map(lambda g: g[1], group)) 
 for k, group in groupby(sorted(zip(seriesA, seriesB)), key=lambda x: x[0])}

print(myCounts)
#{'groupA': Counter({'item1': 2}),
# 'groupB': Counter({'item1': 1, 'item3': 1}),
# 'groupC': Counter({'item2': 1})}

If you don't want Counter objects in the dictionary, you can convert them to plain dicts as follows:

print({k: dict(v) for k, v in myCounts.items()})
#{'groupA': {'item1': 2},
# 'groupB': {'item1': 1, 'item3': 1},
# 'groupC': {'item2': 1}}
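
Since the question is tagged pandas, the same counts can also be sketched with a DataFrame groupby. This assumes pandas is installed; the column names "group" and "item" and the variable names are invented for illustration and do not come from the question:

```python
import pandas as pd

seriesA = ["groupA", "groupA", "groupB", "groupB", "groupC"]
seriesB = ["item1", "item1", "item3", "item1", "item2"]

df = pd.DataFrame({"group": seriesA, "item": seriesB})

# size() counts the rows for each observed (group, item) pair;
# the result is a Series indexed by a (group, item) MultiIndex.
pair_counts = df.groupby(["group", "item"]).size()

# Fold the MultiIndex Series into a plain nested dict of Python ints.
nested = {}
for (group, item), n in pair_counts.items():
    nested.setdefault(group, {})[item] = int(n)
```

If the data is not already in a DataFrame, the standard-library approaches above are probably simpler; this route mostly makes sense when the two series are DataFrame columns to begin with.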