Need help counting unique values across multiple pandas columns

Time: 2019-06-04 04:39:35

Tags: python pandas

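I have a pandas DataFrame with two fields: DBA Name (the facility name) and License #. Each DBA Name is listed multiple times; some listings share the same license number and others do not.

I want to find the number of instances of each DBA Name. I also want to find how many unique license numbers each one has.

I tried value_counts(), but it only works on a single field of a pandas df. I also tried apply(), but that did not work.

My sample code is shown below. Please share your thoughts.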


data = data[['DBA Name','License #']]

data:
    DBA Name                               License #
1   BUSY BUMBLE BEE ACADEMY DAYCARE        2215472.0
2   BUSY BUMBLE BEE ACADEMY DAYCARE        3793.0
3   BUSY BUMBLE BEE ACADEMY DAYCARE        2215472.0
4   BUSY BUMBLE BEE ACADEMY DAYCARE        1194190.0
5   BUSY BUMBLE BEE ACADEMY DAYCARE        2215472.0
6   BUSY BUMBLE BEE ACADEMY DAYCARE        1194190.0
7   BUSY BUMBLE BEE ACADEMY DAYCARE        1194190.0
8   BUSY BUMBLE BEE ACADEMY DAYCARE        3793.0
9   BUSY BUMBLE BEE ACADEMY DAYCARE        3793.0
10  BOTTLES TO BOOKS LEARNING CENTER       1943545.0
11  BOTTLES TO BOOKS LEARNING CENTER       1943545.0
12  BOTTLES TO BOOKS LEARNING CENTER       1926534.0
13  BOTTLES TO BOOKS LEARNING CENTER       1926534.0
14  BOTTLES TO BOOKS LEARNING CENTER       1926534.0
15  BOTTLES TO BOOKS LEARNING CENTER       1943545.0
16  BOTTLES TO BOOKS LEARNING CENTER       1926534.0
17  BOTTLES TO BOOKS LEARNING CENTER       1943545.0
18  A CHILD'S WORLD EARLY LEARNING CENTER  1357825.0
19  A CHILD'S WORLD EARLY LEARNING CENTER  1357825.0
20  A CHILD'S WORLD EARLY LEARNING CENTER  1768092.0
21  A CHILD'S WORLD EARLY LEARNING CENTER  1768092.0
22  A CHILD'S WORLD EARLY LEARNING CENTER  1357825.0
23  A CHILD'S WORLD EARLY LEARNING CENTER  1768092.0
24  A CHILD'S WORLD EARLY LEARNING CENTER  1357825.0

1 answer:

Answer 0 (score: 2)

Use pd.DataFrame.groupby together with agg and nunique:

import pandas as pd

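# df is the two-column DataFrame from the question (named data there).
# 'count' gives the number of rows per name; 'nunique' gives the number of distinct licenses.
df.groupby('DBA Name').agg({'DBA Name': 'count', 'License #': 'nunique'})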

Output:

                                       DBA Name  License #
DBA Name                                                  
A CHILD'S WORLD EARLY LEARNING CENTER         7          2
BOTTLES TO BOOKS LEARNING CENTER              8          2
BUSY BUMBLE BEE ACADEMY DAYCARE               9          3
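
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds the sample rows from the question and uses named aggregation (available in pandas 0.25 and later) rather than the dict form, which is a swapped-in variant and not the answer's exact call. The output column names instances and unique_licenses are hypothetical labels chosen for illustration.

```python
import pandas as pd

# Sample from the question, compressed as (name, license, number of rows).
rows = [
    ("BUSY BUMBLE BEE ACADEMY DAYCARE", 2215472.0, 3),
    ("BUSY BUMBLE BEE ACADEMY DAYCARE", 3793.0, 3),
    ("BUSY BUMBLE BEE ACADEMY DAYCARE", 1194190.0, 3),
    ("BOTTLES TO BOOKS LEARNING CENTER", 1943545.0, 4),
    ("BOTTLES TO BOOKS LEARNING CENTER", 1926534.0, 4),
    ("A CHILD'S WORLD EARLY LEARNING CENTER", 1357825.0, 4),
    ("A CHILD'S WORLD EARLY LEARNING CENTER", 1768092.0, 3),
]

# Expand each (name, license, n) triple into n identical rows.
data = pd.DataFrame(
    [(name, lic) for name, lic, n in rows for _ in range(n)],
    columns=["DBA Name", "License #"],
)

# Named aggregation; "instances" and "unique_licenses" are illustrative names.
summary = data.groupby("DBA Name").agg(
    instances=("License #", "size"),            # rows per name, NaN licenses included
    unique_licenses=("License #", "nunique"),   # distinct non-null licenses per name
)
print(summary)
```

Note that 'size' counts every row in a group, while 'count' skips rows where the aggregated column is missing, so the two can differ if License # contains NaN values.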