How to group by date and count the unique values in each group with pandas

Time: 2018-06-06 12:48:11

Tags: pandas pandas-groupby

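How do I group by date, take the unique values within each group, and count them per group with pandas?

I want to count the number of unique MAC addresses per day.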

pd.concat([df[['date','Client MAC']], df8[['date',"MAC address"]].rename(columns={"MAC address":"Client MAC"})]).groupby(["date"])


Sample data from one of the columns:
Association Time
Mon May 14 19:41:20 HKT 2018
Mon May 14 19:43:22 HKT 2018
Tue May 15 09:24:57 HKT 2018
Mon May 14 19:53:33 HKT 2018

I used:

starttime=datetime.datetime.now()
dff4 = (df4[['Association Time','Client MAC Address']].groupby(pd.to_datetime(df4["Association Time"]).dt.date.apply(lambda x: dt.datetime.strftime(x, '%Y-%m-%d'))).nunique())
print datetime.datetime.now()-starttime

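It takes 2 minutes to run, and the result also has an Association Time column, as if it were grouped by Association Time as well. That is wrong: I only need the unique count of Client MAC Address per day.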

                  Association Time  Client MAC Address
Association Time
2017-06-21                       1                   3
2018-02-21                       2                   8
2018-02-27                       1                   1
2018-03-07                       3                   3

1 answer:

Answer 0: (score: 0)

I think you need to add ['Client MAC'].nunique() after the groupby, so that only that column is counted:

df = (pd.concat([df[['date','Client MAC']],
                 df8[['date',"MAC address"]].rename(columns={"MAC address":"Client MAC"})])
        .groupby(["date"])['Client MAC']
        .nunique())
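To see the effect of selecting the column before calling nunique(), here is a minimal sketch on made-up data. The sample values in `df` and `df8` and the names `combined` and `per_day` are assumptions for illustration only; the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "date": ["2018-05-14", "2018-05-14", "2018-05-15"],
    "Client MAC": ["aa", "bb", "aa"],
})
df8 = pd.DataFrame({
    "date": ["2018-05-14", "2018-05-15"],
    "MAC address": ["aa", "cc"],
})

# Stack both sources under one common column name.
combined = pd.concat([
    df[["date", "Client MAC"]],
    df8[["date", "MAC address"]].rename(columns={"MAC address": "Client MAC"}),
])

# Selecting 'Client MAC' before nunique() returns a Series with one
# count per date, instead of a count for every column in the frame.
per_day = combined.groupby("date")["Client MAC"].nunique()
print(per_day)
```

Because the column is selected before nunique(), the result is a Series indexed by date rather than a DataFrame with one count per input column.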

If date is a datetime column, group by its date part:

df = (pd.concat([df[['date','Client MAC']], 
                df8[['date',"MAC address"]].rename(columns={"MAC address":"Client MAC"})]))

df = df['Client MAC'].groupby(df["date"].dt.date).nunique()
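The same idea can be applied to the Association Time strings from the question. The sketch below assumes that every value ends in the literal "HKT" zone label and that the label can be dropped before parsing. The MAC values and the names `stamps` and `daily_unique` are made up for illustration. Passing an explicit format to pd.to_datetime usually avoids slow per-row format inference, which may help with the 2-minute run time, though that has not been measured here.

```python
import pandas as pd

# Timestamps are taken from the question; the MAC values are hypothetical.
df4 = pd.DataFrame({
    "Association Time": [
        "Mon May 14 19:41:20 HKT 2018",
        "Mon May 14 19:43:22 HKT 2018",
        "Tue May 15 09:24:57 HKT 2018",
        "Mon May 14 19:53:33 HKT 2018",
    ],
    "Client MAC Address": ["aa:aa", "bb:bb", "aa:aa", "aa:aa"],
})

# Assumption: every row carries the same 'HKT' label, so it is stripped
# and the remaining text is parsed with an explicit format.
stamps = pd.to_datetime(
    df4["Association Time"].str.replace(" HKT", "", regex=False),
    format="%a %b %d %H:%M:%S %Y",
)

# Group only the MAC column by calendar date, so no extra
# 'Association Time' count column appears in the result.
daily_unique = df4["Client MAC Address"].groupby(stamps.dt.date).nunique()
print(daily_unique)
```

Grouping the single column by an external key (the parsed dates) keeps Association Time out of the counted columns, which is the difference from the attempt in the question.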