pandas groupby on multiple columns

Time: 2017-11-30 23:36:44

Tags: python pandas data-science data-cleaning

I have a dataset containing state codes and their statuses.

  code  status
1   AZ  a
2   CA  b
3   KS  c
4   MO  c
5   NY  d
6   AZ  d
7   MO  a
8   MO  b
9   MN  b
10  NV  a
11  NV  e
12  MO  f
13  NY  a
14  NY  a
15  NY  b

I want to filter this dataset down to the rows whose status is a, and count them per code. Sample output would be:

  code  status  
1   AZ  a   
2   MO  a   
3   NY  a   

    AZ =1   MO = 1  NY =2

I tried df.groupby("code").loc[df.status == 'a'] but had no luck. Any help is appreciated!

2 Answers:

Answer 0: (score: 2)

Filter the DataFrame for status a first, then group by code and count the rows in each group.

df[df.status == 'a'].groupby('code').size()

Output:

code
AZ    1
MO    1
NV    1
NY    2
dtype: int64
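
For reference, the same filter-then-group approach as a self-contained sketch. The DataFrame is rebuilt from the table in the question; the variable names `only_a` and `counts` are illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild the dataset from the question (15 rows of code/status pairs).
df = pd.DataFrame({
    "code":   ["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO",
               "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
    "status": ["a", "b", "c", "c", "d", "d", "a", "b",
               "b", "a", "e", "f", "a", "a", "b"],
})

# Step 1: keep only the rows whose status is "a" (boolean mask).
only_a = df[df["status"] == "a"]

# Step 2: group the remaining rows by code and count rows per group.
counts = only_a.groupby("code").size()

print(counts)
print(counts.to_dict())
```

Note that NV also has one row with status a, so it shows up in the result even though the sample output in the question leaves it out.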

Answer 1: (score: 0)

I recreated the dataset:

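import pandas as pd

# Two parallel lists: state codes and their statuses
data = [["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO", "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
        ["a", "b", "c", "c", "d", "d", "a", "b", "b", "a", "e", "f", "a", "a", "b"]]

# Transpose so that each list becomes a column
df = pd.DataFrame(data).T
df.columns = ["code", "status"]

# Keep only status "a", then count rows per (code, status) pair
df[df["status"] == "a"].groupby(["code", "status"]).size()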

which gives:

code  status
AZ    a         1
MO    a         1
NV    a         1
NY    a         2
dtype: int64
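
Grouping on two columns returns a Series indexed by a (code, status) MultiIndex. If a flat table is easier to work with, one option is to turn the index levels back into columns. This is a minimal sketch, not part of the original answer; the column name `count` is an arbitrary choice.

```python
import pandas as pd

# Same data as in the question, built column by column.
df = pd.DataFrame({
    "code":   ["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO",
               "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
    "status": ["a", "b", "c", "c", "d", "d", "a", "b",
               "b", "a", "e", "f", "a", "a", "b"],
})

# Filter, group on both columns, count, then flatten the MultiIndex
# into ordinary columns. name="count" labels the column of group sizes.
flat = (
    df[df["status"] == "a"]
    .groupby(["code", "status"])
    .size()
    .reset_index(name="count")
)

print(flat)
```

The result is a regular DataFrame with the columns code, status and count, which can be filtered or merged like any other table.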