Question

I want to count the number of employees in the following dataset:

# prints employees by employer and year
for combined in dic.items():
    print(combined)


    ('a', {2001: 12, 2001: 13, 2001: 15, 2004: 28, 1999: 12})
    ('c', {2000: 23, 2003: 15, 2004: 7, 2005: 24})
    ('b', {2001: 13, 2002: 13, 2012: 12})
    ('e', {2002: 7, 2004: 30, 2005: 7})
    ('d', {2001: 7, 2002: 28, 2010: 24})
    ('g', {2000: 7, 2009: 7, 2010: 333})
    ('f', {2005: 30, 2006: 7, 1999: 12})


from itertools import groupby

for employer, yearIndividuals in dic.items():
    print(employer)
    # iterate over the dictionary to find the combinations
    for year, individuals in yearIndividuals.items():
        #print(employer, individuals, year)
        x = employer, individuals, year
        for grp, elemts in groupby(x, (lambda x: x[1], x[0])):
            print(grp, len(list(elemts)))

I would like the output in the following format:

employer, year, employee
a, 2001, 3
a, 2004, 1
a, 1999, 1
c, 2000, 1
c, 2003, 1
c, 2004, 1
c, 2005, 1

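This is what I need: I am trying to calculate the probability that people switch jobs. Person x might work for company z in the first year and then switch to company a in the second year.
I am trying to find out how such transitions happen.
Assume the table has three columns: employer, employee, and year.
In my example above, the letter a is an employer, and numbers such as 12 are employees.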

How can I do this?

In general, my requirement is to match employers with individuals and count them.

Answer 1

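A dictionary cannot contain duplicate keys (years). You can, however, store a list of employees for each year, e.g. 2001: [12, 13, 15]. Listing the entries is then simple: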

for employer in dic:
    for year in dic[employer]:
        for employee in dic[employer][year]:
            print(employer, year, employee)
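
To get the counts in the requested "employer, year, employee" format, the length of each list can be reported instead of each individual employee. The following is a minimal sketch, assuming the data has been restructured so that every year maps to a list of employee ids; the sample `dic` and the helper name `count_employees` are illustrative and not part of the question.

```python
def count_employees(dic):
    """Return (employer, year, number_of_employees) rows in sorted order."""
    rows = []
    for employer, years in dic.items():
        for year, employees in years.items():
            rows.append((employer, year, len(employees)))
    return sorted(rows)


# Hypothetical data: each year maps to a list of employee ids.
dic = {
    'a': {2001: [12, 13, 15], 2004: [28], 1999: [12]},
    'c': {2000: [23], 2003: [15], 2004: [7], 2005: [24]},
}

for employer, year, count in count_employees(dic):
    print(f"{employer}, {year}, {count}")
```

Rows are sorted by employer and then by year, so the order differs slightly from the sample output in the question, which lists 1999 last.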

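A simple way to estimate the probability that an employee changes jobs (i.e. has held two or more jobs during the recorded period) is to divide the number of employees with two or more jobs by the total number of employees.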
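
That ratio could be computed roughly as follows. This is only a sketch under two assumptions: the data uses the list-per-year layout suggested earlier in this answer, and "two or more jobs" is read as "appears under two or more distinct employers". The function name `switch_probability` and the sample data are hypothetical.

```python
from collections import defaultdict


def switch_probability(dic):
    """Fraction of employees who appear under two or more distinct employers."""
    employers_of = defaultdict(set)  # employee id -> set of employers
    for employer, years in dic.items():
        for employees in years.values():
            for employee in employees:
                employers_of[employee].add(employer)
    if not employers_of:
        return 0.0  # no employees recorded, so nothing to estimate
    switched = sum(1 for emps in employers_of.values() if len(emps) >= 2)
    return switched / len(employers_of)


# Hypothetical data: each year maps to a list of employee ids.
dic = {
    'a': {2001: [12, 13, 15], 2004: [28], 1999: [12]},
    'b': {2001: [13], 2002: [13], 2012: [12]},
}

print(switch_probability(dic))
```

Here employees 12 and 13 each appear under both employers while 15 and 28 appear under one, so two of the four employees count as having switched.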
