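I want to build a single variable that combines what df_dates and df_sum each do. Is that possible? That is, I need to sum all the numeric columns, but for "Date" I need an array (a list) of the values in each group.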
import datetime
import pandas as pd
df = pd.read_csv('global.csv')
df_dates = df.groupby(['Io Id'])['Date'].apply(list)
# Select several columns from a groupby with a list (double brackets);
# the bare-tuple form ['a', 'b'] is rejected by recent pandas versions.
df_sum = df.groupby(['Advertiser ID', 'Campaign Id', 'C Goal', 'C Goal KPI',
                     'C Goal KPI Value', 'Insertion Order', 'Io Id', 'IO Pacing',
                     'IO Pacing Rate', 'IO Pacing Amount', 'IO Goal Type',
                     'IO Goal Value', 'IO Budget Type', 'IO_Bud_Imp', 'IO_Bud_Start',
                     'IO_Bud_End'])[['Impressions', 'Clicks', 'Click Rate (CTR)',
                                     'Total Conversions', 'Post-Click Conversions',
                                     'Post-View Conversions', 'Revenue (Adv Currency)']].sum()
df_dates = df_dates.to_frame()
df_first = pd.merge(df_dates, df_sum, on='Io Id')
Answer 0 (score: 0)
Try using agg with a dictionary that maps each column to its own aggregation function:
Create a list of the columns to be summed:
collist = ['Impressions', 'Clicks', 'Click Rate (CTR)', 'Total Conversions', 'Post-Click Conversions', 'Post-View Conversions', 'Revenue (Adv Currency)']
Build a dictionary from this list, mapping every column to 'sum':
dsum = {i:'sum' for i in collist}
Now add "Date" to this dictionary, with the built-in function list as its aggregator:
dsum['Date'] = list
Now use groupby together with agg:
collist.append('Date')
df.groupby(['Advertiser ID', 'Campaign Id', 'C Goal', 'C Goal KPI',
            'C Goal KPI Value', 'Insertion Order', 'Io Id', 'IO Pacing',
            'IO Pacing Rate', 'IO Pacing Amount', 'IO Goal Type',
            'IO Goal Value', 'IO Budget Type', 'IO_Bud_Imp', 'IO_Bud_Start',
            'IO_Bud_End'])[collist].agg(dsum)
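To check the approach without the original global.csv, here is a minimal sketch on a made-up frame. The data values and the reduced set of columns are assumptions for illustration only. The names sum_cols, agg_spec and result are hypothetical too. It shows one groupby call that sums the numeric columns and collects "Date" into a list per group.

```python
import pandas as pd

# Hypothetical toy data standing in for global.csv (values are made up,
# and only a few of the original columns are kept for brevity).
df = pd.DataFrame({
    "Io Id": [1, 1, 2],
    "Insertion Order": ["A", "A", "B"],
    "Date": ["2019-01-01", "2019-01-02", "2019-01-01"],
    "Impressions": [100, 150, 200],
    "Clicks": [5, 7, 9],
})

# Columns that should be summed within each group.
sum_cols = ["Impressions", "Clicks"]

# Map every numeric column to 'sum', then map 'Date' to the built-in list
# so that each group's dates are collected instead of summed.
agg_spec = {col: "sum" for col in sum_cols}
agg_spec["Date"] = list

# One pass over the groups applies a different aggregation per column;
# reset_index() turns the group keys back into ordinary columns.
result = df.groupby(["Io Id", "Insertion Order"]).agg(agg_spec).reset_index()
print(result)
```

Since the sums and the date lists come out of the same agg call, the separate to_frame and merge steps from the question should no longer be needed.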