Keep a value for each datetime group

Time: 2018-03-17 11:35:08

Tags: python pandas

                     A  B    C    D
0  2002-01-13 15:00:00  X  110  3.9
1  2002-01-13 15:00:00  Y  120  1.9
2  2002-01-13 15:00:00  X  130  8.0 
3  2002-01-13 15:00:00  X  140  9.0
4  2002-01-14 16:00:00  X  110  0.2
5  2002-01-14 16:00:00  Y  120  7.0
6  2002-01-14 16:00:00  X  130  1.6
7  2002-01-14 16:00:00  X  140  3.4

I want to create a new column df["E"] that takes the value of D from the row where B == 'Y', and keeps that value for every row in the same A group.

The output should be:

                     A  B    C    D     E
0  2002-01-13 15:00:00  X  110  3.9   1.9
1  2002-01-13 15:00:00  Y  120  1.9   1.9
2  2002-01-13 15:00:00  X  130  8.0   1.9
3  2002-01-13 15:00:00  X  140  9.0   1.9
4  2002-01-14 16:00:00  X  110  0.2   7.0
5  2002-01-14 16:00:00  Y  120  7.0   7.0
6  2002-01-14 16:00:00  X  130  1.6   7.0
7  2002-01-14 16:00:00  X  140  3.4   7.0
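
For anyone who wants to reproduce this, the frame above can be rebuilt roughly as follows. This is a minimal sketch: it assumes A is a real datetime column (the post only shows the printed form, so the dtype is a guess) and that D is float.

```python
import pandas as pd

# Rebuild the sample frame from the question.
# Assumption: column A is parsed to datetime64; the post only shows its repr.
df = pd.DataFrame({
    "A": pd.to_datetime(
        ["2002-01-13 15:00:00"] * 4 + ["2002-01-14 16:00:00"] * 4
    ),
    "B": ["X", "Y", "X", "X"] * 2,
    "C": [110, 120, 130, 140] * 2,
    "D": [3.9, 1.9, 8.0, 9.0, 0.2, 7.0, 1.6, 3.4],
})

print(df)
```

Note that each A group here contains exactly one row with B == 'Y', which is what the expected output relies on.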

1 Answer:

Answer 0 (score: 1)

Option 1:

In [8]: df.merge(df.loc[df.B=='Y', ['A', 'D']].rename(columns={'D':'E'}))
Out[8]:
                    A  B    C    D    E
0 2002-01-13 15:00:00  X  110  3.9  1.9
1 2002-01-13 15:00:00  Y  120  1.9  1.9
2 2002-01-13 15:00:00  X  130  8.0  1.9
3 2002-01-13 15:00:00  X  140  9.0  1.9
4 2002-01-14 16:00:00  X  110  0.2  7.0
5 2002-01-14 16:00:00  Y  120  7.0  7.0
6 2002-01-14 16:00:00  X  130  1.6  7.0
7 2002-01-14 16:00:00  X  140  3.4  7.0
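
A note on how Option 1 behaves: after the rename, A is the only column the two frames share, so merge joins on A, and the default join is an inner join. That means an A group with no 'Y' row would be dropped from the result. The sketch below uses a small made-up frame (the third timestamp and its values are hypothetical, not from the question) to show how passing how="left" keeps such rows with NaN in E.

```python
import pandas as pd

# Hypothetical data: the third A group has no row with B == 'Y'.
df = pd.DataFrame({
    "A": pd.to_datetime(
        ["2002-01-13 15:00:00"] * 2
        + ["2002-01-14 16:00:00"] * 2
        + ["2002-01-15 17:00:00"]
    ),
    "B": ["X", "Y", "Y", "X", "X"],
    "C": [110, 120, 120, 130, 110],
    "D": [3.9, 1.9, 7.0, 1.6, 5.5],
})

# One row per A group that has a 'Y', with D renamed to E.
lookup = df.loc[df["B"] == "Y", ["A", "D"]].rename(columns={"D": "E"})

inner = df.merge(lookup, on="A")             # default how="inner": drops the third group
left = df.merge(lookup, on="A", how="left")  # keeps every row; E is NaN where no 'Y' exists

print(len(inner), len(left))
print(left)
```

If a group contained more than one 'Y' row, the merge would duplicate that group's rows, so it may be worth deduplicating lookup on A first.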

Option 2:

In [35]: df['E'] = df['A'].map(df.loc[df.B=='Y', ['A', 'D']].set_index('A')['D'])

In [36]: df
Out[36]:
                    A  B    C    D    E
0 2002-01-13 15:00:00  X  110  3.9  1.9
1 2002-01-13 15:00:00  Y  120  1.9  1.9
2 2002-01-13 15:00:00  X  130  8.0  1.9
3 2002-01-13 15:00:00  X  140  9.0  1.9
4 2002-01-14 16:00:00  X  110  0.2  7.0
5 2002-01-14 16:00:00  Y  120  7.0  7.0
6 2002-01-14 16:00:00  X  130  1.6  7.0
7 2002-01-14 16:00:00  X  140  3.4  7.0
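
Option 2 builds a lookup Series indexed by A and maps it back onto the column; groups without a 'Y' row get NaN, and map needs that lookup index to be unique, so it assumes at most one 'Y' row per A group. As a different technique that is not part of the answer above, the same column can be sketched with groupby and transform: mask D everywhere except the 'Y' rows, then broadcast the first non-null value within each A group.

```python
import pandas as pd

# Sample frame from the question (A assumed to be datetime).
df = pd.DataFrame({
    "A": pd.to_datetime(
        ["2002-01-13 15:00:00"] * 4 + ["2002-01-14 16:00:00"] * 4
    ),
    "B": ["X", "Y", "X", "X"] * 2,
    "C": [110, 120, 130, 140] * 2,
    "D": [3.9, 1.9, 8.0, 9.0, 0.2, 7.0, 1.6, 3.4],
})

# Keep D only on rows where B == 'Y' (NaN elsewhere), then fill each A group
# with its first non-null value. "first" skips NaN within the group.
df["E"] = df["D"].where(df["B"].eq("Y")).groupby(df["A"]).transform("first")

print(df)
```

With this variant the row order and index are left untouched, a group with no 'Y' row ends up as NaN, and a group with several 'Y' rows takes the first one.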