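I have the following situation: I have statistical data covering several years. Each year is retrieved from a separate file, and each file is read into a dataframe.
So basically I have multiple dataframes, one per year, each of which looks like this: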
  cat      geo1      geo2      geo3
0   A  0.709238  0.669532 -0.465389
1   B -1.426102 -0.152918  0.700080
2   C -0.486294 -1.619334 -1.711047
3   D  0.392837 -0.754785 -1.686076
4   E  0.603256  0.997562 -0.534222
"cat" is the product category, and geo1 to geo3 are geographic regions.
What I ultimately want to end up with is the following:
            year1  year2  year(n)
cat region
A   geo1
    geo2
    geo3
B   geo1
    geo2
    geo3
However, I can't seem to get this done.
Any hints on how to do this are much appreciated! Best, Joe
Answer 0 (score: 1)
Suppose you load multiple years into one table (perhaps by combining the per-year tables and adding an identifier variable yearX):
In []: df
Out[]:
cat geo1 geo2 geo3 year
0 A 0.709238 0.669532 -0.465389 year1
1 B -1.426102 -0.152918 0.700080 year1
2 C -0.486294 -1.619334 -1.711047 year1
3 D 0.392837 -0.754785 -1.686076 year1
4 E 0.603256 0.997562 -0.534222 year1
5 A 0.709238 0.669532 -0.465389 year2
6 B -1.426102 -0.152918 0.700080 year2
7 C -0.486294 -1.619334 -1.711047 year2
8 D 0.392837 -0.754785 -1.686076 year2
9 E 0.603256 0.997562 -0.534222 year2
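One way to build a combined table like the one above is sketched below. It assumes the per-year dataframes are already in memory; the `frames` dictionary, its labels, and its numbers are hypothetical placeholders, not the asker's data. Each frame gets a `year` column via `assign()`, and `pd.concat()` stacks them vertically.

```python
import pandas as pd

# Hypothetical per-year frames; in practice each would be read from its own file.
frames = {
    "year1": pd.DataFrame({"cat": ["A", "B"], "geo1": [0.7, -1.4], "geo2": [0.6, -0.1]}),
    "year2": pd.DataFrame({"cat": ["A", "B"], "geo1": [0.5, 0.2], "geo2": [0.3, 0.9]}),
}

# Tag every frame with its year label, then stack the frames on top of each other.
df = pd.concat(
    [frame.assign(year=label) for label, frame in frames.items()],
    ignore_index=True,  # rebuild a clean 0..n-1 row index
)
print(df)
```

Keeping the year as an ordinary column (rather than in the index) is what lets the later melt step pass it through as an identifier variable.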
Use melt() to tidy the data:
In []: df = df.melt(value_vars=[c for c in df.columns if 'geo' in c],
var_name='region',
id_vars=['cat', 'year'])
In []: df
Out[]:
cat year region value
0 A year1 geo1 0.709238
1 B year1 geo1 -1.426102
2 C year1 geo1 -0.486294
3 D year1 geo1 0.392837
4 E year1 geo1 0.603256
5 A year2 geo1 0.709238
6 B year2 geo1 -1.426102
7 C year2 geo1 -0.486294
8 D year2 geo1 0.392837
9 E year2 geo1 0.603256
10 A year1 geo2 0.669532
11 B year1 geo2 -0.152918
12 C year1 geo2 -1.619334
13 D year1 geo2 -0.754785
14 E year1 geo2 0.997562
15 A year2 geo2 0.669532
16 B year2 geo2 -0.152918
17 C year2 geo2 -1.619334
18 D year2 geo2 -0.754785
19 E year2 geo2 0.997562
20 A year1 geo3 -0.465389
21 B year1 geo3 0.700080
22 C year1 geo3 -1.711047
23 D year1 geo3 -1.686076
24 E year1 geo3 -0.534222
25 A year2 geo3 -0.465389
26 B year2 geo3 0.700080
27 C year2 geo3 -1.711047
28 D year2 geo3 -1.686076
29 E year2 geo3 -0.534222
Then pivot:
In []: df.pivot_table(columns='year', index=['cat', 'region'])
Out[]:
value
year year1 year2
cat region
A geo1 0.709238 0.709238
geo2 0.669532 0.669532
geo3 -0.465389 -0.465389
B geo1 -1.426102 -1.426102
geo2 -0.152918 -0.152918
geo3 0.700080 0.700080
C geo1 -0.486294 -0.486294
geo2 -1.619334 -1.619334
geo3 -1.711047 -1.711047
D geo1 0.392837 0.392837
geo2 -0.754785 -0.754785
geo3 -1.686076 -1.686076
E geo1 0.603256 0.603256
geo2 0.997562 0.997562
geo3 -0.534222 -0.534222
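The extra `value` header row in the output above appears because no `values` argument was passed to `pivot_table()`. A minimal sketch on small made-up data (the `long_df` name and its numbers are illustrative only) shows that passing `values="value"` leaves just the year labels as columns. Note also that `pivot_table()` aggregates with the mean by default, so repeated (cat, region, year) rows would be averaged rather than reported.

```python
import pandas as pd

# Hypothetical long-format data, shaped like the melted table.
long_df = pd.DataFrame({
    "cat": ["A", "A", "A", "A"],
    "region": ["geo1", "geo1", "geo2", "geo2"],
    "year": ["year1", "year2", "year1", "year2"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Naming the value column explicitly yields single-level columns (one per year).
wide = long_df.pivot_table(index=["cat", "region"], columns="year", values="value")
print(wide)
```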
Answer 1 (score: 0)
Once you have all the dataframes, you can use pivot from pandas. You can find an example here. Hope this helps! :)
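As a rough illustration of the `pivot` suggestion, the sketch below reshapes long-format data with `DataFrame.pivot()`. It assumes the data has already been combined and melted into one row per (cat, region, year), and that the installed pandas accepts a list for `index` in `pivot()`; the `long_df` frame and its numbers are made up. Unlike `pivot_table()`, `pivot()` does not aggregate, so a repeated combination raises a `ValueError` instead of being averaged.

```python
import pandas as pd

# Hypothetical long-format data: one row per (cat, region, year).
long_df = pd.DataFrame({
    "cat": ["A", "A", "B", "B"],
    "region": ["geo1", "geo1", "geo1", "geo1"],
    "year": ["year1", "year2", "year1", "year2"],
    "value": [0.1, 0.2, 0.3, 0.4],
})

# pivot() only reshapes: (cat, region) become the rows, years become the columns.
wide = long_df.pivot(index=["cat", "region"], columns="year", values="value")
print(wide)

# Appending a copy of the first row creates a duplicate (cat, region, year) key,
# which pivot() refuses to reshape.
dup = pd.concat([long_df, long_df.iloc[[0]]], ignore_index=True)
try:
    dup.pivot(index=["cat", "region"], columns="year", values="value")
    raised = False
except ValueError:
    raised = True
print("duplicate keys rejected:", raised)
```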