I have a 2000-row dataset in which each row has a count and a list of countries. I want to break the lists apart and sum up all the counts per country, grouped by month.
df_grouped=df.pivot_table(index=('month','month_int', 'year'),values='views',aggfunc='max')
count period_start year month_int month Countries
1 06/08/2018 2018 6 August []
1 06/08/2018 2018 6 August ['Spain', 'Brazil', 'Porgutal', 'France', 'Romania', 'Germany#', 'Norway']
1 06/08/2018 2018 6 August ['Spain', 'Brazil', 'Porgutal', 'France', 'Romania', 'Germany#', 'Norway']
1 06/08/2018 2018 6 August ['Porgutal', 'Canada', 'USA', 'Croatia', 'Egypt', 'Netherlands', 'Swizerland', 'Japan']
2 06/08/2018 2018 6 August ['China', 'India', 'Vietnam']
1 06/08/2018 2018 6 August ['Indai', ' Pakistan', 'Mongolia']
1 06/08/2018 2018 6 August ['Indai', ' Pakistan', 'Mongolia']
1 06/08/2018 2018 6 August ['Indai', ' Pakistan', 'Mongolia']
1 06/08/2018 2018 6 August []
1 06/08/2018 2018 6 August ['Germany', 'Spain', 'China', 'USA']
6 06/08/2018 2018 6 August ['Germany', 'Spain', 'China', 'USA']
1 06/08/2018 2018 6 Sept ['Germany', 'Spain', 'China', 'USA']
5 06/08/2018 2018 6 Sept ['Germany', 'Spain', 'China', 'USA']
4 06/08/2018 2018 6 Sept ['Germany', 'Spain', 'China', 'USA']
....
I'm not sure how to explode the Countries column, add up the count from each row, and then group by country.
Answer 0 (score: 0)
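Use .explode() and .groupby(). You need reset_index() to turn the result back into a DataFrame, and you should pass name='Countries Count' or any name other than Countries; otherwise you will get an error because that column name already exists: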
df = (df.explode('Countries')
.groupby(['year','month','Countries'])['Countries'].count().reset_index(name='Countries Count'))
df
Out[1]:
year month Countries Countries Count
0 2018 August Pakistan 3
1 2018 August Brazil 2
2 2018 August Canada 1
3 2018 August China 3
4 2018 August Croatia 1
5 2018 August Egypt 1
6 2018 August France 2
7 2018 August Germany 2
8 2018 August Germany# 2
9 2018 August Indai 3
10 2018 August India 1
11 2018 August Japan 1
12 2018 August Mongolia 3
13 2018 August Netherlands 1
14 2018 August Norway 2
15 2018 August Porgutal 3
16 2018 August Romania 2
17 2018 August Spain 4
18 2018 August Swizerland 1
19 2018 August USA 3
20 2018 August Vietnam 1
21 2018 Sept China 3
22 2018 Sept Germany 3
23 2018 Sept Spain 3
24 2018 Sept USA 3
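Note that the result above counts how many rows mention each country; it does not add up the count column. If the goal is the sum of count per country and month, as the question describes, a minimal sketch could look like the following. The sample frame is a small hypothetical subset shaped like the question's data, and the output column name Total Count is only an illustrative choice. It also assumes the Countries column holds real Python lists rather than strings.

```python
import pandas as pd

# Hypothetical sample mirroring the question's columns (not the full dataset).
df = pd.DataFrame({
    "count": [1, 2, 6, 5],
    "year": [2018, 2018, 2018, 2018],
    "month": ["August", "August", "August", "Sept"],
    "Countries": [
        [],
        ["China", "India", "Vietnam"],
        ["Germany", "Spain", "China", "USA"],
        ["Germany", "Spain", "China", "USA"],
    ],
})

totals = (
    df.explode("Countries")            # one row per (original row, country)
      .dropna(subset=["Countries"])    # empty lists explode to NaN; drop them
      .groupby(["year", "month", "Countries"], as_index=False)["count"]
      .sum()                           # add up count instead of counting rows
      .rename(columns={"count": "Total Count"})  # illustrative column name
)
print(totals)
```

Because each exploded row carries its original count, a row with count 6 and four countries contributes 6 to each of those four countries.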