I have a pandas DataFrame formatted like this:
mesh 1 energy low [eV] energy high [eV] nuclide score mean
x y z
0 1 1 1 1.00e-03 2.00e+07 total flux 0.00e+00
1 1 1 2 1.00e-03 2.00e+07 total flux 1.82e-03
2 1 1 3 1.00e-03 2.00e+07 total flux 6.96e-03
3 1 1 4 1.00e-03 2.00e+07 total flux 1.47e-03
4 1 1 5 1.00e-03 2.00e+07 total flux 6.93e-03
5 1 1 6 1.00e-03 2.00e+07 total flux 8.73e-03
6 1 1 7 1.00e-03 2.00e+07 total flux 1.34e-02
7 1 1 8 1.00e-03 2.00e+07 total flux 1.16e-02
8 1 1 9 1.00e-03 2.00e+07 total flux 4.14e-03
9 1 1 10 1.00e-03 2.00e+07 total flux 5.26e-03
10 1 2 1 1.00e-03 2.00e+07 total flux 6.16e-03
11 1 2 2 1.00e-03 2.00e+07 total flux 1.76e-02
12 1 2 3 1.00e-03 2.00e+07 total flux 1.80e-02
13 1 2 4 1.00e-03 2.00e+07 total flux 1.97e-02
14 1 2 5 1.00e-03 2.00e+07 total flux 1.76e-02
15 1 2 6 1.00e-03 2.00e+07 total flux 1.90e-02
16 1 2 7 1.00e-03 2.00e+07 total flux 3.53e-02
17 1 2 8 1.00e-03 2.00e+07 total flux 0.00e+00
18 1 2 9 1.00e-03 2.00e+07 total flux 0.00e+00
19 1 2 10 1.00e-03 2.00e+07 total flux 0.00e+00
20 1 3 1 1.00e-03 2.00e+07 total flux 0.00e+00
21 1 3 2 1.00e-03 2.00e+07 total flux 0.00e+00
22 1 3 3 1.00e-03 2.00e+07 total flux 0.00e+00
23 1 3 4 1.00e-03 2.00e+07 total flux 0.00e+00
24 1 3 5 1.00e-03 2.00e+07 total flux 0.00e+00
25 1 3 6 1.00e-03 2.00e+07 total flux 0.00e+00
26 1 3 7 1.00e-03 2.00e+07 total flux 0.00e+00
27 1 3 8 1.00e-03 2.00e+07 total flux 0.00e+00
28 1 3 9 1.00e-03 2.00e+07 total flux 0.00e+00
29 1 3 10 1.00e-03 2.00e+07 total flux 0.00e+00
... ... ... .. ... ... ... ... ...
99970 100 98 1 1.00e-03 2.00e+07 total flux 0.00e+00
99971 100 98 2 1.00e-03 2.00e+07 total flux 0.00e+00
99972 100 98 3 1.00e-03 2.00e+07 total flux 0.00e+00
99973 100 98 4 1.00e-03 2.00e+07 total flux 0.00e+00
99974 100 98 5 1.00e-03 2.00e+07 total flux 0.00e+00
99975 100 98 6 1.00e-03 2.00e+07 total flux 0.00e+00
99976 100 98 7 1.00e-03 2.00e+07 total flux 0.00e+00
99977 100 98 8 1.00e-03 2.00e+07 total flux 0.00e+00
99978 100 98 9 1.00e-03 2.00e+07 total flux 0.00e+00
99979 100 98 10 1.00e-03 2.00e+07 total flux 0.00e+00
99980 100 99 1 1.00e-03 2.00e+07 total flux 0.00e+00
99981 100 99 2 1.00e-03 2.00e+07 total flux 0.00e+00
99982 100 99 3 1.00e-03 2.00e+07 total flux 0.00e+00
99983 100 99 4 1.00e-03 2.00e+07 total flux 0.00e+00
99984 100 99 5 1.00e-03 2.00e+07 total flux 0.00e+00
99985 100 99 6 1.00e-03 2.00e+07 total flux 0.00e+00
99986 100 99 7 1.00e-03 2.00e+07 total flux 0.00e+00
99987 100 99 8 1.00e-03 2.00e+07 total flux 0.00e+00
99988 100 99 9 1.00e-03 2.00e+07 total flux 0.00e+00
99989 100 99 10 1.00e-03 2.00e+07 total flux 0.00e+00
99990 100 100 1 1.00e-03 2.00e+07 total flux 0.00e+00
99991 100 100 2 1.00e-03 2.00e+07 total flux 0.00e+00
99992 100 100 3 1.00e-03 2.00e+07 total flux 0.00e+00
99993 100 100 4 1.00e-03 2.00e+07 total flux 0.00e+00
99994 100 100 5 1.00e-03 2.00e+07 total flux 0.00e+00
99995 100 100 6 1.00e-03 2.00e+07 total flux 0.00e+00
99996 100 100 7 1.00e-03 2.00e+07 total flux 0.00e+00
99997 100 100 8 1.00e-03 2.00e+07 total flux 0.00e+00
99998 100 100 9 1.00e-03 2.00e+07 total flux 0.00e+00
99999 100 100 10 1.00e-03 2.00e+07 total flux 0.00e+00
RangeIndex(start=0, stop=100000, step=1)
MultiIndex(levels=[['energy high [eV]', 'energy low [eV]', 'mean', 'mesh 1', 'nuclide', 'score', 'std. dev.'], ['', 'x', 'y', 'z']],
labels=[[3, 3, 3, 1, 0, 4, 5, 2, 6], [1, 2, 3, 0, 0, 0, 0, 0, 0]])
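I would like to have 10 pandas DataFrames in a list (because ('mesh 1', 'z') runs from 1 to 10), where in each DataFrame the rows are ('mesh 1', 'y'), the columns are ('mesh 1', 'x'), and the values are 'mean'. I have figured out how to get the 10 DataFrames into a list: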
axial_dfs = []
for i in range(1, 11):  # z runs from 1 to 10 in the data shown above
    temp_df = flux_df[flux_df['mesh 1']['z'] == i]
    axial_dfs.append(temp_df)
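But I can't figure out how to change the rows and columns. I would try pivot, but I don't know how to use it with the MultiIndex columns under 'mesh 1'.
Any help is appreciated. Thanks!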
Answer 0: (score: 1)
I'm a little confused about what you need, but I think merging the column levels of temp_df together will help you:
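axial_dfs = []
for i in range(1, 11):  # z runs from 1 to 10 in the data shown above
    temp_df = flux_df[flux_df['mesh 1']['z'] == i]
    # add this line: join the two column levels into one string per column;
    # columns whose second level is '' end up with a trailing '_' (e.g. 'mean_')
    temp_df.columns = temp_df.columns.map('_'.join)
    axial_dfs.append(temp_df)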
Now every frame in axial_dfs has a single level of columns (e.g. mesh 1_x or mesh 1_y), which sounds like something you are comfortable manipulating yourself (using pandas.DataFrame.pivot_table or pandas.DataFrame.groupby).
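For what it's worth, here is a minimal sketch of how the pivot step might look once the columns are flattened. The tiny 3 x 2 x 2 mesh and its values are made up for illustration (the real flux_df is not reproduced here), and the trailing underscore is stripped so that the value column is simply named mean:

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical stand-in for flux_df: a 3 x 2 x 2 mesh with made-up 'mean' values.
nx, ny, nz = 3, 2, 2
coords = list(itertools.product(range(1, nx + 1), range(1, ny + 1), range(1, nz + 1)))
flux_df = pd.DataFrame(coords, columns=["x", "y", "z"])
flux_df["mean"] = np.arange(len(flux_df), dtype=float)
flux_df.columns = pd.MultiIndex.from_tuples(
    [("mesh 1", "x"), ("mesh 1", "y"), ("mesh 1", "z"), ("mean", "")]
)

# Flatten the two column levels, dropping the trailing '_' left by empty second levels.
flat_df = flux_df.copy()
flat_df.columns = ["_".join(col).rstrip("_") for col in flat_df.columns]

# One frame per z layer: rows are y, columns are x, values are 'mean'.
# groupby sorts its keys, so the list is ordered by increasing z.
axial_dfs = [
    layer.pivot(index="mesh 1_y", columns="mesh 1_x", values="mean")
    for _, layer in flat_df.groupby("mesh 1_z")
]

print(axial_dfs[0])
```

pivot raises an error if a (y, x) pair occurs more than once within a layer; in that case pivot_table with an aggregation function would be the fallback.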
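Answer 1: (score: 0)
In the following example, I use unstack to turn the second index level into the column index. I then use a list comprehension to split the result into a list of frames, one per value of the first index level.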
import pandas as pd
import numpy as np
# Create simple example
data = np.random.randint(8, size=(8, 2))
levels = [['df1', 'df2'], ['a', 'b'], [1, 2]]
idx = pd.MultiIndex.from_product(levels, names=['first', 'second', 'third'])
df = pd.DataFrame(data, index=idx, columns=['col1', 'col2'])
# Step 1: unstack to get second level as column index
df = df.unstack(level='second')['col2']
# Step 2: get a list of chunks of df by first index level
first_unique = df.index.get_level_values('first').unique()
df_ls = [df.loc[x] for x in first_unique]
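The same two steps could be applied to the mesh data roughly as follows. This is only a sketch under assumptions: the small flux_df below is a made-up stand-in with the same column layout as in the question, and the level names z, y and x are my own labels.

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical stand-in for flux_df: a 3 x 2 x 2 mesh with made-up 'mean' values.
nx, ny, nz = 3, 2, 2
coords = list(itertools.product(range(1, nx + 1), range(1, ny + 1), range(1, nz + 1)))
flux_df = pd.DataFrame(coords, columns=["x", "y", "z"])
flux_df["mean"] = np.arange(len(flux_df), dtype=float)
flux_df.columns = pd.MultiIndex.from_tuples(
    [("mesh 1", "x"), ("mesh 1", "y"), ("mesh 1", "z"), ("mean", "")]
)

# Move the mesh coordinates into a (z, y, x) row MultiIndex for the 'mean' values.
idx = pd.MultiIndex.from_arrays(
    [flux_df[("mesh 1", "z")], flux_df[("mesh 1", "y")], flux_df[("mesh 1", "x")]],
    names=["z", "y", "x"],
)
mean = pd.Series(flux_df[("mean", "")].to_numpy(), index=idx, name="mean")

# Step 1: unstack x so it becomes the column index; rows stay indexed by (z, y).
wide = mean.unstack(level="x")

# Step 2: split into one frame per z value; each frame has y as rows and x as columns.
z_values = wide.index.get_level_values("z").unique()
axial_dfs = [wide.loc[z] for z in z_values]

print(axial_dfs[0])
```

Building the index by hand avoids flattening the column MultiIndex, at the cost of a few extra lines.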