Dropping rows by comparing two DataFrames

Time: 2018-04-19 15:53:45

Tags: python pandas dataframe pandas-groupby

I have two DataFrames, nf and nf1.

nf looks like this:

    StationID   DateTime    Channel Class1Count Class2Count Class3Count Class4Count Class5Count Class6Count Class7Count ... Station-ID1 Record Type FIPS State Code Restrictions    Month   Day Year    Hour    Total Interval Volume   Classification Data Time Interval
0   1   2017-10-01 00:00:00 1   1   201 8   2   0   0   0   ... 111001  C   11  0   10  1   2017    0   212 
1   1   2017-10-01 00:00:00 2   0   138 17  2   0   0   0   ... 111002  C   11  0   10  1   2017    0   157 
2   1   2017-10-01 00:00:00 3   0   190 63  0   5   0   0   ... 111002  C   11  0   10  1   2017    0   258 
3   1   2017-10-01 00:00:00 4   0   150 8   0   0   0

nf1 looks like this:

    Class1Count Class2Count Class3Count Class4Count Class5Count Class6Count Class7Count Class8Count Class9Count Class10Count    Class11Count    Class12Count    Class13Count    Class14Count    Class15Count    Total Interval Volume
Channel                                                             
1   1.231800    217.339674  22.622814   2.015312    4.725919    0.882855    0.172724    0.777843    0.658472    0.429533    0.000053    0.000053    0.219879    0.0 0.975575    252.052506
2   2.231112    309.971548  31.127689   3.566335    12.905425   1.029141    0.129119    1.352072    0.450514    0.075925    0.000689    0.001007    0.022359    0.0 0.068878    362.931811
3   1.566203    295.166053  39.603417   8.349304    27.653974   1.021972    0.292649    1.522719    1.309444    0.674738    0.000460    0.000690    0.428506    0.0 19.633268   397.223398
4   3.503365    327.521710  18.011284   3.794444    47.587370   0.865712    0.187673    4.342154    0.977762    0.753398    0.001188    0.000198    0.599248    0.0 8.785139    416.930645
5   1.828119    290.336466  94.103376   3.224558    81.108446   1.465315    0.321380    4.821813    1.323235    0.924199    0.000618    0.000618    0.710523    0.0 3.253741    483.422406
6   1.746899    223.591279  32.923450   2.229845    5.561628    0.788566    0.878682    1.137791    0.689147    

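nf1 is nf grouped by Channel, with the mean taken for each group.

I want to drop every row in nf whose Class2Count is less than 10% of the Class2Count in nf1 for the same channel. In other words, each row should be compared only against the mean of its own channel.

Can someone help me?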

1 Answer:

Answer 0: (score: 1)

First, create a Series that maps each Channel to its mean Class2Count (nf1 is already indexed by Channel):

s = nf1['Class2Count']

Then filter nf accordingly, looking up each row's channel mean with map:

nf = nf[nf['Class2Count'] > 0.1 * nf['Channel'].map(s)]
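
To see the two steps working end to end, here is a minimal sketch on a toy frame. The data values are invented for illustration, and nf1 is rebuilt with groupby(...).mean() on the assumption that this matches how it was produced in the question. The names threshold, filtered, threshold_alt and filtered_alt are only for this example.

```python
import pandas as pd

# Toy stand-in for nf; the numbers are made up for illustration only.
nf = pd.DataFrame({
    "Channel":     [1, 1, 1, 2, 2, 2],
    "Class2Count": [200, 5, 95, 300, 10, 290],
})

# Assumed construction of nf1: per-channel means, indexed by Channel.
nf1 = nf.groupby("Channel").mean()

# Series indexed by Channel holding the mean Class2Count.
s = nf1["Class2Count"]

# Look up each row's channel mean, then keep rows above 10% of it.
threshold = 0.1 * nf["Channel"].map(s)
filtered = nf[nf["Class2Count"] > threshold]

# Possible alternative that skips nf1: transform broadcasts the
# per-channel mean back onto every row of nf.
threshold_alt = 0.1 * nf.groupby("Channel")["Class2Count"].transform("mean")
filtered_alt = nf[nf["Class2Count"] > threshold_alt]

print(filtered)
```

Note that the strict `>` also drops rows that sit at exactly 10% of the channel mean. If only rows strictly below 10% should go, `>=` would be the comparison to use.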