Using Python 3.4 and pandas, I have a pivot table that looks like this:
Impressions
Day 2015-07-06 2015-07-07 2015-07-08 2015-07-09 2015-07-10 2015-07-11 2015-07-12 2015-07-13 2015-07-14 2015-07-15 2015-07-16 2015-07-17 2015-07-18 2015-07-19
Keyword
home brewing 1098 1323 2116 2574 1484 1533 1782 1615 1866 1936 1331 1274 1193 1483
It is produced by this code:
import pandas as pd
import numpy as np
from io import StringIO
data = StringIO('''Day Keyword Impressions Clicks Cost Avg. position Converted clicks
7/9/2015 "home brewing" 2571 6 4.13 3.1 0
7/8/2015 "home brewing" 2113 13 10.02 3.1 1
7/15/2015 "home brewing" 1933 9 9.3 2.8 0
7/14/2015 "home brewing" 1865 3 2.64 2.6 0
7/12/2015 "home brewing" 1781 7 4.93 2.6 0
7/13/2015 "home brewing" 1612 10 9.67 2.6 0
7/11/2015 "home brewing" 1530 9 9.23 2.6 0
7/10/2015 "home brewing" 1482 4 3.73 2.8 0
7/19/2015 "home brewing" 1482 5 3.26 2.5 0
7/16/2015 "home brewing" 1329 6 5.72 2.9 0
7/7/2015 "home brewing" 1318 3 2.55 2.7 0
7/17/2015 "home brewing" 1272 6 5.42 2.7 0
7/18/2015 "home brewing" 1192 5 4.5 2.5 0
7/6/2015 "home brewing" 1095 8 6.02 2.9 0
7/7/2015 "home brewing" 5 1 0.61 4 0
7/6/2015 "home brewing" 3 0 0 3.3 0
7/8/2015 "home brewing" 3 1 0.61 3.3 0
7/9/2015 "home brewing" 3 0 0 4.3 0
7/13/2015 "home brewing" 3 0 0 2.7 0
7/11/2015 "home brewing" 3 0 0 3.3 0
7/15/2015 "home brewing" 3 0 0 6.3 0
7/10/2015 "home brewing" 2 0 0 4.5 0
7/16/2015 "home brewing" 2 1 0.56 2.5 0
7/17/2015 "home brewing" 2 0 0 4 0
7/12/2015 "home brewing" 1 0 0 2 0
7/14/2015 "home brewing" 1 0 0 7 0
7/18/2015 "home brewing" 1 0 0 2 0
7/19/2015 "home brewing" 1 0 0 4 0''')
# DataFrame.from_csv was removed in pandas 1.0; read_csv with parse_dates keeps 'Day' as a datetime column
df = pd.read_csv(data, sep='\t', parse_dates=['Day'])
pt = pd.pivot_table(df, values=['Impressions'], index=['Keyword'], columns=['Day'], aggfunc='sum')
print(pt)
What I would like to do is group the Day columns using a 7-day frequency, to get a summed pivot table that looks like this:
Impressions
Day 2015-07-06 2015-07-13
Keyword
home brewing 11910 10698
Answer 0 (score: 1)
One way is to use the .dt accessor of a pd.Series to get the week of the year, and then pivot on that column.
import pandas as pd
import numpy as np
# simulate your data
# ===================================
np.random.seed(0)
day = np.random.choice(pd.date_range('2015-07-01', '2015-07-31', freq='D'), size = 100)
impressions = np.random.randint(1, 1000, size=100)
keyword_str = ['home brewing'] * 100
df = pd.DataFrame(dict(Day=day, Keyword=keyword_str, Impressions=impressions))
df
Day Impressions Keyword
0 2015-07-13 204 home brewing
1 2015-07-16 325 home brewing
2 2015-07-22 775 home brewing
3 2015-07-01 965 home brewing
4 2015-07-04 48 home brewing
5 2015-07-28 640 home brewing
6 2015-07-04 132 home brewing
7 2015-07-08 973 home brewing
.. ... ... ...
92 2015-07-01 287 home brewing
93 2015-07-15 281 home brewing
94 2015-07-04 638 home brewing
95 2015-07-22 771 home brewing
96 2015-07-13 516 home brewing
97 2015-07-26 95 home brewing
98 2015-07-11 227 home brewing
99 2015-07-21 876 home brewing
[100 rows x 3 columns]
# processing
# ===================================
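# Series.dt.weekofyear was removed in pandas 2.0; isocalendar().week returns the same ISO week number
df['week_of_year'] = df['Day'].dt.isocalendar().week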
Day Impressions Keyword week_of_year
0 2015-07-13 204 home brewing 29
1 2015-07-16 325 home brewing 29
2 2015-07-22 775 home brewing 30
3 2015-07-01 965 home brewing 27
4 2015-07-04 48 home brewing 27
5 2015-07-28 640 home brewing 31
6 2015-07-04 132 home brewing 27
7 2015-07-08 973 home brewing 28
.. ... ... ... ...
92 2015-07-01 287 home brewing 27
93 2015-07-15 281 home brewing 29
94 2015-07-04 638 home brewing 27
95 2015-07-22 771 home brewing 30
96 2015-07-13 516 home brewing 29
97 2015-07-26 95 home brewing 30
98 2015-07-11 227 home brewing 28
99 2015-07-21 876 home brewing 30
pd.pivot_table(df, index='Keyword', columns='week_of_year', values='Impressions', aggfunc='sum')
week_of_year 27 28 29 30 31
Keyword
home brewing 9656 10934 9419 14519 4320
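Week numbers work, but the question asks for columns labeled by date. A minimal sketch of one way to get that, assuming weeks should run Monday through Sunday: convert each day to a weekly period and take its start time. The `week_start` column name is made up for illustration, and the daily totals are copied from the question's first pivot table rather than from the simulated data.

```python
import pandas as pd

# Daily totals taken from the question's first pivot table
days = pd.date_range('2015-07-06', '2015-07-19', freq='D')
totals = [1098, 1323, 2116, 2574, 1484, 1533, 1782,
          1615, 1866, 1936, 1331, 1274, 1193, 1483]
df = pd.DataFrame({'Day': days, 'Keyword': 'home brewing', 'Impressions': totals})

# Label each row with the Monday that starts its calendar week
# ('W' periods run Monday through Sunday); 'week_start' is an illustrative name
df['week_start'] = df['Day'].dt.to_period('W').dt.start_time

pt = pd.pivot_table(df, index='Keyword', columns='week_start',
                    values='Impressions', aggfunc='sum')
print(pt)
```

Because the question's data happens to start on a Monday, this gives the same two columns (2015-07-06 and 2015-07-13) that the question asks for.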
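# Alternatively, resample on a 7-day frequency.
# resample(..., how=...) was removed from pandas, so call .sum() on the resampler instead.
df.set_index('Day').groupby('Keyword')['Impressions'].resample('7D').sum().reset_index().pivot(index='Keyword', columns='Day', values='Impressions')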
Day 2015-07-01 2015-07-08 2015-07-15 2015-07-22 2015-07-29
Keyword
home brewing 13450 9377 13191 10422 2408
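The set_index / resample / reset_index / pivot chain can probably be shortened by grouping on a `pd.Grouper` directly. The following is only a sketch of that idea, not part of the original answer; it rebuilds the question's 28 rows (the 14 large rows plus the 14 small duplicate rows) instead of using the simulated data, so the result can be compared with the output the question asks for.

```python
import pandas as pd

# Impressions from the question, ordered by day: the large rows, then the small duplicate rows
days = pd.date_range('2015-07-06', '2015-07-19', freq='D')
main = [1095, 1318, 2113, 2571, 1482, 1530, 1781,
        1612, 1865, 1933, 1329, 1272, 1192, 1482]
extra = [3, 5, 3, 3, 2, 3, 1, 3, 1, 3, 2, 2, 1, 1]
df = pd.DataFrame({
    'Day': list(days) * 2,
    'Keyword': 'home brewing',
    'Impressions': main + extra,
})

# Group by keyword and by 7-day bins of 'Day', sum, then move the bins into columns
weekly = (
    df.groupby(['Keyword', pd.Grouper(key='Day', freq='7D')])['Impressions']
      .sum()
      .unstack('Day')
)
print(weekly)
```

The 7-day bins start at the first day found in the data, which is why the simulated data above produces columns starting at 2015-07-01 while the question's data starts at 2015-07-06.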
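Answer 1 (score: 0)
I selected Jianxun Li's answer as the correct one, but I wanted to post a commented version, because I'm sure I will come back to this later when I have forgotten how it works. Thanks, Jianxun!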
import pandas as pd
import numpy as np
import scipy.stats as sp
from io import StringIO
data = StringIO('''Day Keyword Impressions Clicks Cost Avg. position Converted clicks
7/9/2015 "home brewing" 2571 6 4.13 3.1 0
7/8/2015 "home brewing" 2113 13 10.02 3.1 1
7/15/2015 "home brewing" 1933 9 9.3 2.8 0
7/14/2015 "home brewing" 1865 3 2.64 2.6 0
7/12/2015 "home brewing" 1781 7 4.93 2.6 0
7/13/2015 "home brewing" 1612 10 9.67 2.6 0
7/11/2015 "home brewing" 1530 9 9.23 2.6 0
7/10/2015 "home brewing" 1482 4 3.73 2.8 0
7/19/2015 "home brewing" 1482 5 3.26 2.5 0
7/16/2015 "home brewing" 1329 6 5.72 2.9 0
7/7/2015 "home brewing" 1318 3 2.55 2.7 0
7/17/2015 "home brewing" 1272 6 5.42 2.7 0
7/18/2015 "home brewing" 1192 5 4.5 2.5 0
7/6/2015 "home brewing" 1095 8 6.02 2.9 0
7/7/2015 "home brewing" 5 1 0.61 4 0
7/6/2015 "home brewing" 3 0 0 3.3 0
7/8/2015 "home brewing" 3 1 0.61 3.3 0
7/9/2015 "home brewing" 3 0 0 4.3 0
7/13/2015 "home brewing" 3 0 0 2.7 0
7/11/2015 "home brewing" 3 0 0 3.3 0
7/15/2015 "home brewing" 3 0 0 6.3 0
7/10/2015 "home brewing" 2 0 0 4.5 0
7/16/2015 "home brewing" 2 1 0.56 2.5 0
7/17/2015 "home brewing" 2 0 0 4 0
7/12/2015 "home brewing" 1 0 0 2 0
7/14/2015 "home brewing" 1 0 0 7 0
7/18/2015 "home brewing" 1 0 0 2 0
7/19/2015 "home brewing" 1 0 0 4 0''')
#Read data into dataframe (DataFrame.from_csv was removed in pandas 1.0, so use read_csv)
df = pd.read_csv(data, sep='\t')
#Drop unneeded columns
df = df.drop(['Clicks', 'Cost', 'Converted clicks', 'Avg. position'], axis=1)
#set 'Day' to a datetime dtype
df['Day'] = pd.to_datetime(df['Day'])
#Set index to be 'Day'
df = df.set_index('Day')
#Group by keyword
df = df.groupby('Keyword')
#Resample the index by 7 days and sum (the how= argument was removed; call .sum() on the resampler)
df = df.resample('7D').sum()
'''df looks like this currently...
Impressions
Keyword Day
home brewing 2015-07-06 11910
2015-07-13 10698
'''
#Reset the index now that date is grouped
df = df.reset_index()
'''
Keyword Day Impressions
0 home brewing 2015-07-06 11910
1 home brewing 2015-07-13 10698
'''
#This part pivots the data to have 'Day' be columns
df = df.pivot(index='Keyword', columns='Day', values='Impressions')
print(df)
''' #End Result#
Day 2015-07-06 2015-07-13
Keyword
home brewing 11910 10698
'''
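One thing to keep in mind with '7D': the bins start at the first day present in the data, which only lines up with calendar weeks here because 2015-07-06 is a Monday. The sketch below compares '7D' with a Monday-anchored weekly frequency, under the assumption that Monday-to-Sunday weeks are what is wanted. `weekly_pivot` is a hypothetical helper, and the data is the question's daily totals with the first day dropped so that it starts on a Tuesday.

```python
import pandas as pd

# Daily totals from the question, minus the first day, so the data starts on a Tuesday
days = pd.date_range('2015-07-07', '2015-07-19', freq='D')
totals = [1323, 2116, 2574, 1484, 1533, 1782,
          1615, 1866, 1936, 1331, 1274, 1193, 1483]
df = pd.DataFrame({'Day': days, 'Keyword': 'home brewing', 'Impressions': totals})


def weekly_pivot(frame, freq, **grouper_kwargs):
    """Hypothetical helper: sum Impressions per Keyword and time bin, with bins as columns."""
    grouper = pd.Grouper(key='Day', freq=freq, **grouper_kwargs)
    return frame.groupby(['Keyword', grouper])['Impressions'].sum().unstack('Day')


# '7D' bins start at the first day present in the data (Tuesday 2015-07-07)
from_first_day = weekly_pivot(df, '7D')

# 'W-MON' bins are anchored to calendar Mondays; closed='left' and label='left'
# make each bin run Monday through Sunday and carry its Monday as the label
by_calendar_week = weekly_pivot(df, 'W-MON', closed='left', label='left')

print(from_first_day)
print(by_calendar_week)
```

With this input the '7D' columns are labeled 2015-07-07 and 2015-07-14, while the 'W-MON' columns are labeled 2015-07-06 and 2015-07-13, so the two approaches put 2015-07-13 in different bins.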