Question

I have transaction totals grouped by date_month, device, and channel:

date_month   device            channel  transactions
2017-01-01  desktop         AFFILIATES           413
2017-01-01   mobile         AFFILIATES           501
2017-01-01    other         AFFILIATES            22
2017-01-01   tablet         AFFILIATES           250
2017-01-01  desktop             DIRECT         13979
etc...       etc...             etc...        etc...

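date_month ranges from 2017-01-01 to the current date.

What I want to do is split the other value of the device field across the remaining devices (desktop, mobile, or tablet).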

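Example process:

Pivot the transactions of the device value 'other' into an additional column (other_transactions)
Take the sum of transactions partitioned/grouped by date_month and channel (total_transactions)
Then divide transactions by total_transactions to get the percent of total (percent_total)
Multiply percent_total by other_transactions to get other_split
Add other_split to transactions to get the updated transactions field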

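Getting the totals and applying simple math shouldn't be a problem. I would do something along the lines of df['total_transactions']=df.groupby(['date_month', 'channel'])['transactions'].transform('sum') to get total_transactions, but the problem I'm having is getting the other transactions into a separate column like this: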

date_month   device     channel  transactions  other_trans
2017-01-01  desktop  AFFILIATES           413           22
2017-01-01   mobile  AFFILIATES           501           22
2017-01-01   tablet  AFFILIATES           250           22
2017-01-01  desktop      DIRECT         13979          etc
etc...       etc...      etc...        etc...

Ultimately, I would like a dataframe that drops the other device from the device column and splits its transactions across the remaining devices based on that date_month and channel.
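
For reference, here is a minimal sketch of what the transform('sum') call mentioned above does on the sample rows from the question. The frame is rebuilt by hand from the table, so anything beyond those five rows is an assumption. Note that at this point the group total still includes the other row, because it has not been separated out yet.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "date_month": ["2017-01-01"] * 5,
    "device": ["desktop", "mobile", "other", "tablet", "desktop"],
    "channel": ["AFFILIATES"] * 4 + ["DIRECT"],
    "transactions": [413, 501, 22, 250, 13979],
})

# transform("sum") returns one value per original row (not one per group),
# so the group total lines up with df and can be assigned as a new column.
df["total_transactions"] = (
    df.groupby(["date_month", "channel"])["transactions"].transform("sum")
)

print(df)
```

Because the result keeps the original index, no merge is needed to attach the totals; that is the difference from a plain groupby(...).sum().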

Answer 1

IIUC, you can first create another dataframe using groupby, drop the rows with other, and then perform a merge:

import pandas as pd

df = pd.DataFrame({'date_month': {0: '2017-01-01', 1: '2017-01-01', 2: '2017-01-01', 3: '2017-01-01', 4: '2017-01-01', 5:"2017-01-01"},
                   'device': {0: 'desktop', 1: 'mobile', 2: 'other', 3: 'tablet', 4: 'desktop', 5:"other"},
                   'channel': {0: 'AFFILIATES', 1: 'AFFILIATES', 2: 'AFFILIATES', 3: 'AFFILIATES', 4: 'DIRECT', 5: 'DIRECT'},
                   'transactions': {0: 413, 1: 501, 2: 22, 3: 250, 4: 13979, 5: 234}})

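# Pull the "other" rows into their own frame, keeping only the merge keys and the value.
other = df.groupby("device").get_group("other")[["date_month","channel","transactions"]]

# Drop the "other" rows from the main frame (exact match rather than substring match).
df = df[df["device"] != "other"]

# Attach each group's "other" transactions as a new column, transactions_other.
df = df.merge(other, on=["date_month","channel"], how="left", suffixes=("","_other"))

print(df)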

Result:

   date_month   device     channel  transactions  transactions_other
0  2017-01-01  desktop  AFFILIATES           413                  22
1  2017-01-01   mobile  AFFILIATES           501                  22
2  2017-01-01   tablet  AFFILIATES           250                  22
3  2017-01-01  desktop      DIRECT         13979                 234
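
The merged frame above covers only the first step of the process in the question. A possible continuation for the remaining steps might look like the sketch below. It rebuilds the merged frame by hand from the printed result, reuses the question's column names (total_transactions, percent_total, other_split), and introduces transactions_updated as a hypothetical name for the final column. It also assumes that a group with no other row should receive nothing, and that fractional results are acceptable.

```python
import pandas as pd

# Merged frame as printed in the result above.
df = pd.DataFrame({
    "date_month": ["2017-01-01"] * 4,
    "device": ["desktop", "mobile", "tablet", "desktop"],
    "channel": ["AFFILIATES", "AFFILIATES", "AFFILIATES", "DIRECT"],
    "transactions": [413, 501, 250, 13979],
    "transactions_other": [22, 22, 22, 234],
})

keys = ["date_month", "channel"]

# A left merge leaves NaN where a group had no "other" row; treat that as 0
# (assumption: such groups simply have nothing to redistribute).
df["transactions_other"] = df["transactions_other"].fillna(0)

# Total of the remaining devices within each (date_month, channel) group.
df["total_transactions"] = df.groupby(keys)["transactions"].transform("sum")

# Each device's share of its group total.
df["percent_total"] = df["transactions"] / df["total_transactions"]

# Portion of the group's "other" transactions allocated to each device.
df["other_split"] = df["percent_total"] * df["transactions_other"]

# Updated transactions (hypothetical column name, kept separate for checking).
df["transactions_updated"] = df["transactions"] + df["other_split"]

print(df[["date_month", "device", "channel", "transactions_updated"]])
```

Within each group the allocated pieces add back up to the original other value, so the overall transaction count is preserved. The results are fractional; whether and how to round them depends on how the numbers are used downstream.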

How to split a field value based on percent of total
