I have a pandas DataFrame containing some values:
                        id  pair      value  subdir
taylor_1e3c_1s_56C  taylor  6_13  -0.398716    run1
taylor_1e3c_1s_56C  taylor  6_13  -0.397820    run2
taylor_1e3c_1s_56C  taylor  6_13  -0.397310    run3
taylor_1e3c_1s_56C  taylor  6_13  -0.390520    run4
taylor_1e3c_1s_56C  taylor  6_13  -0.377390    run5
taylor_1e3c_1s_56C  taylor  8_11  -0.393604    run1
taylor_1e3c_1s_56C  taylor  8_11  -0.392899    run2
taylor_1e3c_1s_56C  taylor  8_11  -0.392473    run3
taylor_1e3c_1s_56C  taylor  8_11  -0.389959    run4
taylor_1e3c_1s_56C  taylor  8_11  -0.387946    run5
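What I want to do is isolate the rows that share the same index, id and pair, compute the mean over the value column, and put the result in a new DataFrame. Since that effectively averages over all possible values of subdir, that column should also be dropped. So the output should look like this: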
                        id  pair      value
taylor_1e3c_1s_56C  taylor  6_13  -0.392351
taylor_1e3c_1s_56C  taylor  8_11  -0.391376
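How should I do this in pandas?
Answer 0 (score: 3)
Use syntactic sugar: group the value Series by the index together with the id and pair columns, then aggregate with mean: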
df = df['value'].groupby([df.index, df['id'], df['pair']]).mean().reset_index(level=[1,2])
print (df)
                        id  pair      value
taylor_1e3c_1s_56C  taylor  6_13  -0.392351
taylor_1e3c_1s_56C  taylor  8_11  -0.391376
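For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand; the construction code and the name result are my own additions, not part of the original post. The groupby call itself is the one shown above.

```python
import pandas as pd

# Rebuild the sample data from the question (this construction is illustrative).
df = pd.DataFrame(
    {
        "id": ["taylor"] * 10,
        "pair": ["6_13"] * 5 + ["8_11"] * 5,
        "value": [
            -0.398716, -0.397820, -0.397310, -0.390520, -0.377390,
            -0.393604, -0.392899, -0.392473, -0.389959, -0.387946,
        ],
        "subdir": ["run1", "run2", "run3", "run4", "run5"] * 2,
    },
    index=["taylor_1e3c_1s_56C"] * 10,
)

# Group the value Series by the index plus the id and pair columns,
# average each group, then move id and pair back into columns.
result = (
    df["value"]
    .groupby([df.index, df["id"], df["pair"]])
    .mean()
    .reset_index(level=[1, 2])
)
print(result)
```

Because only the value Series is aggregated, subdir never enters the result, so there is nothing to drop afterwards.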
Classic solution: first call reset_index to turn the index into a column, then groupby on the column names and aggregate with mean:
df = df.reset_index().groupby(['index','id','pair'])['value'].mean().reset_index(level=[1,2])
print (df)
                        id  pair      value
index
taylor_1e3c_1s_56C  taylor  6_13  -0.392351
taylor_1e3c_1s_56C  taylor  8_11  -0.391376
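As a quick sanity check, the sketch below runs both variants on the same hand-built sample data and compares them. The names sugar, classic and max_diff are hypothetical, chosen only for this illustration. As far as I can tell, the only visible difference is the name of the remaining index.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
df = pd.DataFrame(
    {
        "id": ["taylor"] * 10,
        "pair": ["6_13"] * 5 + ["8_11"] * 5,
        "value": [
            -0.398716, -0.397820, -0.397310, -0.390520, -0.377390,
            -0.393604, -0.392899, -0.392473, -0.389959, -0.387946,
        ],
        "subdir": ["run1", "run2", "run3", "run4", "run5"] * 2,
    },
    index=["taylor_1e3c_1s_56C"] * 10,
)

# Variant 1: group the Series by the index object and two column Series.
sugar = (
    df["value"]
    .groupby([df.index, df["id"], df["pair"]])
    .mean()
    .reset_index(level=[1, 2])
)

# Variant 2: turn the index into a column first, then group by column names.
classic = (
    df.reset_index()
    .groupby(["index", "id", "pair"])["value"]
    .mean()
    .reset_index(level=[1, 2])
)

# Largest absolute difference between the two sets of group means.
max_diff = (sugar["value"].to_numpy() - classic["value"].to_numpy()).max()
print(max_diff)

# The unnamed index stays unnamed in variant 1; variant 2 names it "index".
print(sugar.index.name, classic.index.name)
```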
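For reference, calling reset_index on the original DataFrame moves the index into a regular column named index: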
print (df.reset_index())
                index      id  pair      value  subdir
0  taylor_1e3c_1s_56C  taylor  6_13  -0.398716    run1
1  taylor_1e3c_1s_56C  taylor  6_13  -0.397820    run2
2  taylor_1e3c_1s_56C  taylor  6_13  -0.397310    run3
3  taylor_1e3c_1s_56C  taylor  6_13  -0.390520    run4
4  taylor_1e3c_1s_56C  taylor  6_13  -0.377390    run5
5  taylor_1e3c_1s_56C  taylor  8_11  -0.393604    run1
6  taylor_1e3c_1s_56C  taylor  8_11  -0.392899    run2
7  taylor_1e3c_1s_56C  taylor  8_11  -0.392473    run3
8  taylor_1e3c_1s_56C  taylor  8_11  -0.389959    run4
9  taylor_1e3c_1s_56C  taylor  8_11  -0.387946    run5
Details:
After aggregating with mean, you get a Series with a MultiIndex of 3 levels:
print (df.reset_index().groupby(['index','id','pair'])['value'].mean())
index               id      pair
taylor_1e3c_1s_56C  taylor  6_13   -0.392351
                            8_11   -0.391376
Name: value, dtype: float64
Chaining reset_index(level=[1,2]) onto that result gives:
print (df.reset_index()
.groupby(['index','id','pair'])['value']
.mean()
.reset_index(level=[1,2]))
                        id  pair      value
index
taylor_1e3c_1s_56C  taylor  6_13  -0.392351
taylor_1e3c_1s_56C  taylor  8_11  -0.391376
So reset_index(level=[1,2]) is necessary to convert the second and third levels (id and pair) back into columns.
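The level handling described above can be inspected directly. The sketch below is an illustration on hand-built sample data, with hypothetical variable names (s, by_pos, by_name, flat). It checks the number of index levels after the aggregation and shows that, assuming the levels carry the names index, id and pair, they can be reset by name as well as by position.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
df = pd.DataFrame(
    {
        "id": ["taylor"] * 10,
        "pair": ["6_13"] * 5 + ["8_11"] * 5,
        "value": [
            -0.398716, -0.397820, -0.397310, -0.390520, -0.377390,
            -0.393604, -0.392899, -0.392473, -0.389959, -0.387946,
        ],
        "subdir": ["run1", "run2", "run3", "run4", "run5"] * 2,
    },
    index=["taylor_1e3c_1s_56C"] * 10,
)

# Intermediate result: a Series indexed by (index, id, pair).
s = df.reset_index().groupby(["index", "id", "pair"])["value"].mean()
print(s.index.nlevels)
print(list(s.index.names))

# Move only levels 1 and 2 back to columns, by position and by name.
by_pos = s.reset_index(level=[1, 2])
by_name = s.reset_index(level=["id", "pair"])

# Resetting every level instead yields a flat frame with a default integer index.
flat = s.reset_index()
print(list(flat.columns))
```

Resetting by name can be easier to read than positional levels, at the cost of depending on the level names staying the same.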