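I'm new to programming, but I'm currently working with DataFrames. I'm trying to stack my current DataFrame into a specific layout. The files I'm working with are fairly large and contain a lot of data. However, I can't get stack() to arrange the data the way I want, and the resulting shape is completely messed up. I need help defining a MultiIndex so that the columns have more levels.
The result I get from my code (before stack()):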
   Exports     NaN     NaN      NaN  Net Exports     NaN     NaN
0    Total  Sweden  Norway  Germany        Total  Sweden  Norway
1   1032.8     358   239.7    435.1        636.8   274.1     9.7
2   1198.8   556.4   211.8    430.6        846.3   522.6    -1.1
with stack():
Exports Total NaN Sweden NaN Norway NaN Germany Net Exports Total NaN Sweden NaN Norway NaN Germany NaN GWh 1 Exports 1032.8 NaN 358 NaN 239.7 NaN 435.1 Net Exports 636.8 NaN 274.1 NaN 9.7 NaN 353
Thanks in advance for your help.
Answer 0 (score: 1)
I think you need:
print (r.head())
    Unnamed: 18 Unnamed: 19 Unnamed: 20 Unnamed: 21  Unnamed: 22 Unnamed: 23  \
0       Exports         NaN         NaN         NaN  Net Exports         NaN
2         Total      Sweden      Norway     Germany        Total      Sweden
189      1032.8         358       239.7       435.1        636.8       274.1
190      1198.8       556.4       211.8       430.6        846.3       522.6
191       982.7       159.3       166.2       657.2        276.3      -156.8

    Unnamed: 24 Unnamed: 25     Unit:
0           NaN         NaN       NaN
2        Norway     Germany       GWh
189         9.7         353   January
190        -1.1       324.8  February
191      -105.9         539     March
# Create the index from the 'Unit:' column
r = r.set_index('Unit:')
# Build a MultiIndex from the first and second rows;
# NaNs in the first row are forward-filled with ffill()
r.columns = pd.MultiIndex.from_arrays([r.iloc[0].ffill(), r.iloc[1]], names=(None, None))
# Drop the first and second rows, which now serve as the column header
r = r.iloc[2:]
print (r.head())
         Exports                         Net Exports
           Total Sweden Norway Germany         Total Sweden Norway Germany
Unit:
January   1032.8    358  239.7   435.1         636.8  274.1    9.7     353
February  1198.8  556.4  211.8   430.6         846.3  522.6   -1.1   324.8
March      982.7  159.3  166.2   657.2         276.3 -156.8 -105.9     539
April      962.3   22.1     62   878.2        -268.6 -741.3 -352.9   825.6
May        951.2   13.5   15.9   921.8        -511.5 -885.2 -496.4   870.1
print (r.stack().head(10))
                  Exports Net Exports
Unit:
January  Germany    435.1         353
         Norway     239.7         9.7
         Sweden       358       274.1
         Total     1032.8       636.8
February Germany    430.6       324.8
         Norway     211.8        -1.1
         Sweden     556.4       522.6
         Total     1198.8       846.3
March    Germany    657.2         539
         Norway     166.2      -105.9
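If you want to try the steps end to end without the original file, here is a minimal self-contained sketch. The frame `raw` is a made-up stand-in that only mimics the layout printed above (two header rows stored as data, plus a 'Unit:' column), using the January and February values from the output. The `astype(float)` call is my addition and assumes every remaining cell is numeric. Passing `names=[None, None]` keeps the row labels of the two header rows from being picked up as level names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame read from the file: the two header
# rows are stored as ordinary data rows, as in the question.
raw = pd.DataFrame(
    [
        ["Exports", np.nan, np.nan, np.nan, "Net Exports", np.nan, np.nan, np.nan, np.nan],
        ["Total", "Sweden", "Norway", "Germany", "Total", "Sweden", "Norway", "Germany", "GWh"],
        [1032.8, 358, 239.7, 435.1, 636.8, 274.1, 9.7, 353, "January"],
        [1198.8, 556.4, 211.8, 430.6, 846.3, 522.6, -1.1, 324.8, "February"],
    ],
    columns=[f"Unnamed: {i}" for i in range(18, 26)] + ["Unit:"],
)

# Month names become the row index.
r = raw.set_index("Unit:")

# First row (forward-filled) is the top column level, second row the inner level.
r.columns = pd.MultiIndex.from_arrays(
    [r.iloc[0].ffill(), r.iloc[1]], names=[None, None]
)

# Drop the two header rows and convert the remaining cells to numbers
# (assumption: all remaining values are numeric).
r = r.iloc[2:].astype(float)

# stack() moves the inner column level (Total/Sweden/...) into the row index.
stacked = r.stack()
print(stacked)
```

The row order within each month can differ between pandas versions, so look values up by label, for example `stacked.loc[("January", "Sweden"), "Exports"]`, rather than by position.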