Question

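I managed to load the table below into a pandas DataFrame. It has a MultiIndex (file_type, server_count, file_count, thread_count, cacheclear_type) that describes the configuration of a performance measurement. I then ran each configuration 5 times.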

+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
|           |              |            |              |                 | run_001 | run_002 | run_003 | run_004 | run_005 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| file_type | server_count | file_count | thread_count | cacheclear_type |         |         |         |         |         |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| gor       | 01servers    | 05files    | 20threads    | ccALWAYS        | 15.918  | 16.275  | 15.807  | 17.781  | 16.233  |
|           | 08servers    | 05files    | 20threads    | ccALWAYS        | 17.061  | 15.414  | 16.819  | 15.597  | 16.818  |
| gorz      | 01servers    | 05files    | 20threads    | ccALWAYS        | 12.285  | 11.218  | 12.009  | 14.122  | 10.991  |
|           | 08servers    | 05files    | 20threads    | ccALWAYS        | 9.881   | 9.405   | 9.322   | 10.184  | 9.924   |
| gor       | 01servers    | 10files    | 20threads    | ccALWAYS        | 17.322  | 17.636  | 16.096  | 16.484  | 16.715  |
|           | 08servers    | 10files    | 20threads    | ccALWAYS        | 17.167  | 17.666  | 15.950  | 18.867  | 16.569  |
| gorz      | 01servers    | 10files    | 20threads    | ccALWAYS        | 14.718  | 19.553  | 17.930  | 21.415  | 21.495  |
|           | 08servers    | 10files    | 20threads    | ccALWAYS        | 10.236  | 9.948   | 12.605  | 9.780   | 10.320  |
| gor       | 01servers    | 15files    | 20threads    | ccALWAYS        | 19.265  | 17.128  | 17.630  | 18.739  | 16.833  |
|           | 08servers    | 15files    | 20threads    | ccALWAYS        | 23.083  | 22.084  | 25.024  | 24.677  | 20.648  |
| gorz      | 01servers    | 15files    | 20threads    | ccALWAYS        | 15.401  | 28.282  | 28.727  | 24.645  | 27.509  |
|           | 08servers    | 15files    | 20threads    | ccALWAYS        | 10.307  | 12.217  | 13.005  | 12.277  | 12.224  |
| gor       | 01servers    | 20files    | 20threads    | ccALWAYS        | 23.744  | 20.539  | 21.416  | 22.921  | 22.794  |
|           | 08servers    | 20files    | 20threads    | ccALWAYS        | 35.393  | 36.218  | 35.949  | 35.157  | 37.342  |
| gorz      | 01servers    | 20files    | 20threads    | ccALWAYS        | 19.505  | 23.756  | 25.767  | 26.575  | 25.239  |
|           | 08servers    | 20files    | 20threads    | ccALWAYS        | 11.398  | 11.332  | 15.086  | 16.115  | 13.479  |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+

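I want to take all of the gor, 01servers, 20threads, ccALWAYS configurations and create one data point for each XXfiles configuration. As a first step, I would like to somehow get a DataFrame that looks like this: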

+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
|           |              |            |              |                 | run_001 | run_002 | run_003 | run_004 | run_005 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| file_type | server_count | file_count | thread_count | cacheclear_type |         |         |         |         |         |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| gor       | 01servers    | 05files    | 20threads    | ccALWAYS        | 15.918  | 16.275  | 15.807  | 17.781  | 16.233  |
| gor       | 01servers    | 10files    | 20threads    | ccALWAYS        | 17.322  | 17.636  | 16.096  | 16.484  | 16.715  |
| gor       | 01servers    | 15files    | 20threads    | ccALWAYS        | 19.265  | 17.128  | 17.630  | 18.739  | 16.833  |
| gor       | 01servers    | 20files    | 20threads    | ccALWAYS        | 23.744  | 20.539  | 21.416  | 22.921  | 22.794  |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+

How can I do this?

Answer 1

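I managed to filter the data with the query() function, using the code below, so that it looks like the second table in the question. (The original call ended in sortlevel(2), which has been removed from recent pandas releases; sort_index(level=2) is the equivalent.)

df.query('file_type == "gor" and server_count == "01servers"').sort_index(level=2)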
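
For anyone who wants to try this end to end, here is a minimal sketch of the same filter on a reduced copy of the question's table, followed by one way to turn each XXfiles row into a single data point. The reduced row set, the variable names (rows, subset, summary) and the choice of mean and standard deviation as the "data point" are my own assumptions, not something stated in the question.

```python
import pandas as pd

index_names = ["file_type", "server_count", "file_count",
               "thread_count", "cacheclear_type"]
run_names = ["run_001", "run_002", "run_003", "run_004", "run_005"]

# A reduced copy of the table from the question (values copied verbatim).
rows = {
    ("gor", "01servers", "05files", "20threads", "ccALWAYS"):
        [15.918, 16.275, 15.807, 17.781, 16.233],
    ("gor", "08servers", "05files", "20threads", "ccALWAYS"):
        [17.061, 15.414, 16.819, 15.597, 16.818],
    ("gorz", "01servers", "05files", "20threads", "ccALWAYS"):
        [12.285, 11.218, 12.009, 14.122, 10.991],
    ("gor", "01servers", "10files", "20threads", "ccALWAYS"):
        [17.322, 17.636, 16.096, 16.484, 16.715],
    ("gor", "08servers", "10files", "20threads", "ccALWAYS"):
        [17.167, 17.666, 15.950, 18.867, 16.569],
    ("gor", "01servers", "15files", "20threads", "ccALWAYS"):
        [19.265, 17.128, 17.630, 18.739, 16.833],
    ("gor", "01servers", "20files", "20threads", "ccALWAYS"):
        [23.744, 20.539, 21.416, 22.921, 22.794],
}

index = pd.MultiIndex.from_tuples(list(rows.keys()), names=index_names)
df = pd.DataFrame(list(rows.values()), index=index, columns=run_names)

# query() can refer to named index levels as if they were columns.
subset = (
    df.query('file_type == "gor" and server_count == "01servers"')
      .sort_index(level="file_count")
)

# Assumption: one data point per file_count means the mean over the
# five runs, with the standard deviation kept as a spread estimate.
summary = pd.DataFrame({
    "mean": subset.mean(axis=1),
    "std": subset.std(axis=1),
})
summary.index = summary.index.get_level_values("file_count")

print(subset)
print(summary)
```

With matplotlib installed, something like summary["mean"].plot(yerr=summary["std"]) should then draw one point per file count with error bars, though I have not shown that here.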

