How do I divide an entire DataFrame by certain rows?

Time: 2017-10-22 17:37:42

Tags: python pandas

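I have a DataFrame like the one below, except much longer. Taken together, Var, Type, and Level identify a unique entry. I want to divide every entry by the Unexposed entry of its group (for example, 'Any All Exposed' would be divided by 'Any All Unexposed', and 'Any Existing Exposed' would be divided by 'Any Existing Unexposed').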

Var   Type      Level           Metric1 Metric2 Metric3
Any   All       Unexposed        34842  30783   -12
Any   All       Exposed          54167  54247   0.15
Any   All       LowExposure      20236  20311   0.37
Any   All       MediumExposure   15254  15388   0.87
Any   All       HighExposure     18677  18548   0.7
Any   New       Unexposed        0      23785   0
Any   New       Exposed          0      43030   0
Any   New       LowExposure      0      16356   0
Any   New       MediumExposure   0      12213   0
Any   New       HighExposure     0      14461   0
Any   Existing  Unexposed       34843   6998    -80
Any   Existing  Exposed         54167   11217   -80
Any   Existing  LowExposure     20236   3955    -81
Any   Existing  MediumExposure  15254   3175    -79
Any   Existing  HighExposure    18677   4087    -78

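I think the simplest approach would be to create a MultiIndex, but I have tried several ways without success (I usually get an error about a non-unique index).

The expected result looks like the following, with each row divided by the Unexposed row that shares its Var and Type values.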

Var   Type      Level           Metric1 Metric2 Metric3  MP1  MP2     MP3
Any   All       Unexposed        34842  30783   -12      1.00  1.00   1.00
Any   All       Exposed          54167  54247   0.15     1.55  1.76  -0.01
Any   All       LowExposure      20236  20311   0.37     0.58  0.66  -0.03
Any   All       MediumExposure   15254  15388   0.87     0.44  0.50  -0.07
Any   All       HighExposure     18677  18548   0.7      0.54  0.60  -0.06
Any   New       Unexposed        0      23785   0        0.00  1.00   0.00
Any   New       Exposed          0      43030   0        0.00  1.81   0.00
Any   New       LowExposure      0      16356   0        0.00  0.69   0.00
Any   New       MediumExposure   0      12213   0        0.00  0.51   0.00
Any   New       HighExposure     0      14461   0        0.00  0.61   0.00
Any   Existing  Unexposed       34843   6998    -80      1.00  1.00   1.00
Any   Existing  Exposed         54167   11217   -80      1.55  1.60   1.00
Any   Existing  LowExposure     20236   3955    -81      0.58  0.57   1.01
Any   Existing  MediumExposure  15254   3175    -79      0.44  0.45   0.99
Any   Existing  HighExposure    18677   4087    -78      0.54  0.58   0.98

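2 Answers:

Answer 0: (score: 0)

I'm not sure I understood the question correctly. Would this do? You can iterate over all the unique combinations and perform the division.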
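The loop described here can be sketched as follows. This is only an illustration, not the answer's original code: the frame holds four rows copied from the question, and the names `metric_cols`, `pieces`, and `ratios` are made up for the example. It assumes every (Var, Type) combination has exactly one Unexposed row.

```python
import pandas as pd

# Four rows copied from the question's data, enough to form two groups.
df = pd.DataFrame({
    "Var": ["Any"] * 4,
    "Type": ["All", "All", "Existing", "Existing"],
    "Level": ["Unexposed", "Exposed", "Unexposed", "Exposed"],
    "Metric1": [34842, 54167, 34843, 54167],
    "Metric2": [30783, 54247, 6998, 11217],
})

metric_cols = ["Metric1", "Metric2"]  # hypothetical name for the columns to divide

pieces = []
# Visit every unique (Var, Type) combination once.
for var, typ in df[["Var", "Type"]].drop_duplicates().itertuples(index=False):
    group = df[(df["Var"] == var) & (df["Type"] == typ)]
    # The group's Unexposed row supplies the denominators (assumed to exist).
    denom = group.loc[group["Level"] == "Unexposed", metric_cols].iloc[0]
    # Dividing a frame by a Series matches the Series labels to the columns.
    pieces.append(group[metric_cols].div(denom))

# Reassemble the per-group results in the original row order.
ratios = pd.concat(pieces).sort_index().round(2)
print(ratios)
```

For long frames a grouped operation is usually faster than an explicit loop, but the loop makes each step easy to inspect.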


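Answer 1: (score: 0)

To divide every row in each Var/Type group by the row for a particular Level, use groupby and divide.

For example, to divide by Unexposed, as in your sample output: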

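def divide_by(g, denom_lvl):
    cols = ["Metric1", "Metric2", "Metric3"]
    num = g[cols]
    # The row whose Level equals denom_lvl supplies the denominators for this group.
    denom = g.loc[g.Level == denom_lvl, cols].iloc[0]
    # 0/0 produces NaN, which is reported as 0 to match the expected output.
    return num.divide(denom).fillna(0).round(2)

# group_keys=False keeps the original row index instead of prepending Var and Type to it.
df.groupby(['Var', 'Type'], group_keys=False).apply(divide_by, denom_lvl='Unexposed')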

Output:

    Metric1  Metric2  Metric3
0      1.00     1.00     1.00
1      1.55     1.76    -0.01
2      0.58     0.66    -0.03
3      0.44     0.50    -0.07
4      0.54     0.60    -0.06
5      0.00     1.00     0.00
6      0.00     1.81     0.00
7      0.00     0.69     0.00
8      0.00     0.51     0.00
9      0.00     0.61     0.00
10     1.00     1.00     1.00
11     1.55     1.60     1.00
12     0.58     0.57     1.01
13     0.44     0.45     0.99
14     0.54     0.58     0.98
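The question's expected output keeps the original columns and appends the ratios as MP1, MP2, and MP3. A minimal sketch of that last step, assuming the frame has a unique row index so that the ratios line up row for row. `divide_by` is repeated here so the snippet runs on its own, the data is a four-row subset of the question's, and the names `COLS`, `ratios`, and `result` are illustrative.

```python
import pandas as pd

# Four rows copied from the question, including a group whose Metric1 is all zeros.
df = pd.DataFrame({
    "Var": ["Any"] * 4,
    "Type": ["All", "All", "New", "New"],
    "Level": ["Unexposed", "Exposed", "Unexposed", "Exposed"],
    "Metric1": [34842, 54167, 0, 0],
    "Metric2": [30783, 54247, 23785, 43030],
    "Metric3": [-12, 0.15, 0, 0],
})

COLS = ["Metric1", "Metric2", "Metric3"]

def divide_by(g, denom_lvl):
    # The row whose Level equals denom_lvl supplies the denominators.
    denom = g.loc[g["Level"] == denom_lvl, COLS].iloc[0]
    # 0/0 produces NaN, which is replaced by 0 as in the expected output.
    return g[COLS].divide(denom).fillna(0).round(2)

# Selecting the needed columns first keeps the grouping keys out of each group;
# group_keys=False leaves the original row index untouched.
ratios = (
    df.groupby(["Var", "Type"], group_keys=False)[["Level"] + COLS]
      .apply(divide_by, denom_lvl="Unexposed")
)
ratios.columns = ["MP1", "MP2", "MP3"]

# join aligns on the row index, so each ratio lands next to its source row.
result = df.join(ratios)
print(result)
```

Note that `fillna(0)` only covers the 0/0 case. A nonzero value divided by a zero denominator would come out as infinity and would need separate handling.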