I have a dataframe that I need to retrieve many metrics from. Dataframe columns are the following:
Consumer_ID|Client|Campaign|Date
I am trying to get the unique count of the Consumer_ID column for various combinations of the Client, Campaign, and Date columns. So far I have come up with two solutions:
My question: is there a cleaner, more Pythonic way of getting the unique count of one column for all available combinations of other columns?
Example (annoying) solution using groupbys. Yes, understood, but is there a more Pythonic way to get every combination of the groupby columns? For example, right now, to get all combinations I'd have to write:
df.groupby(['Client']).Consumer_ID.nunique()
df.groupby(['Client', 'Campaign']).Consumer_ID.nunique()
df.groupby(['Client', 'Campaign', 'Date']).Consumer_ID.nunique()
df.groupby(['Client', 'Date']).Consumer_ID.nunique()
Answer 0 (score: 1)
If I understand correctly:
df.groupby(df.columns.drop('Consumer_ID').tolist(), as_index=False).nunique()
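A minimal sketch of this idea on made-up data. The rows and the unique_consumers column name are assumptions for illustration; only the column names come from the question. It uses the Series form of nunique followed by reset_index, which should give one row per existing Client/Campaign/Date combination:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Consumer_ID": [1, 2, 2, 3],
    "Client": ["A", "A", "A", "B"],
    "Campaign": ["x", "x", "x", "y"],
    "Date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-02"],
})

# Group by every column except the one being counted
group_cols = df.columns.drop("Consumer_ID").tolist()

# One row per existing combination, with the number of distinct consumers
counts = (
    df.groupby(group_cols)["Consumer_ID"]
    .nunique()
    .reset_index(name="unique_consumers")
)
print(counts)
```

Note that this only covers the single grouping that uses all three columns, not the smaller combinations such as Client alone.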
Answer 1 (score: 0)
I believe what you're looking for is:
df.groupby(['Client', 'Campaign', 'Date']).Consumer_ID.nunique()
Answer 2 (score: 0)
You can use a pivot table, as follows:
import pandas as pd
pd.pivot_table(df, index=['Client', 'Campaign', 'Date'], values='Consumer_ID', aggfunc=pd.Series.nunique)
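As a quick check, here is a small sketch on made-up data (the rows are assumptions; only the column names come from the question) suggesting that the pivot table gives the same counts as the three-column groupby from the question:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Consumer_ID": [1, 2, 2, 3],
    "Client": ["A", "A", "A", "B"],
    "Campaign": ["x", "x", "x", "y"],
    "Date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-02"],
})

keys = ["Client", "Campaign", "Date"]

# Pivot-table form: a DataFrame with a single Consumer_ID column
pivot = pd.pivot_table(df, index=keys, values="Consumer_ID", aggfunc=pd.Series.nunique)

# Equivalent groupby form: a Series indexed by the same keys
grouped = df.groupby(keys)["Consumer_ID"].nunique()

print(pivot)
```

Like the plain groupby, this handles one fixed set of index columns per call, so it does not by itself enumerate every combination.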
Answer 3 (score: 0)
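Answering my own question. I used itertools combinations to create every possible combination of columns, and then used those to run all of the groupby aggregations. Sample code below:
from itertools import combinations
# Group only by the non-ID columns; Consumer_ID is the column being counted
cols = df.columns.drop('Consumer_ID')
# All non-empty combinations of the grouping columns, as lists for groupby
col_combinations = [list(c) for i in range(len(cols)) for c in combinations(cols, i + 1)]
I can then use the different column combinations in the col_combinations list to run all of the groupby aggregations without having to write the groupby statement multiple times.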
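The remaining step, running one groupby per combination, could look roughly like the sketch below. The sample rows and the unique_counts dictionary are assumptions for illustration, not part of the original answer:

```python
from itertools import combinations

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Consumer_ID": [1, 1, 2, 3, 3],
    "Client": ["A", "A", "A", "B", "B"],
    "Campaign": ["x", "y", "x", "x", "x"],
    "Date": ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"],
})

group_cols = ["Client", "Campaign", "Date"]

# Every non-empty subset of the grouping columns
col_combinations = [
    list(combo)
    for r in range(1, len(group_cols) + 1)
    for combo in combinations(group_cols, r)
]

# Map each combination (as a tuple) to its unique-consumer counts
unique_counts = {
    tuple(cols): df.groupby(cols)["Consumer_ID"].nunique()
    for cols in col_combinations
}

for cols, counts in unique_counts.items():
    print(cols)
    print(counts, end="\n\n")
```

Keying the results by tuple keeps each Series addressable, for example unique_counts[("Client", "Date")].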
Thanks!