pandas: how to get the row index by a data frame column value

Date: 2016-04-14 20:55:36

Tags: loops pandas indexing

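I have a data frame with an "Operation" column whose values define cycles. For example, Operation = 4 defines the start of a cycle with duration 1, and Operation = 9 is the end of that cycle. The same goes for cycles A2, A3, and so on. If I need A1, A2, A3, etc., how can I define more and more cycles by their start and end operations, and how do I insert the values and add the new columns? Just to emphasize: when cycles overlap, I want to create a new column, or fill the next existing column if one is already there. Thanks for your help.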


1 answer:

Answer 0 (score: 0)

UPDATE:

In [136]: df
Out[136]:
   a  b  c  operation
0  8  7  5          1
1  3  1  0          2
2  5  4  0          3
3  9  7  7          4
4  4  6  8          5
5  7  8  0          6
6  1  8  8          7
7  0  9  0          8
8  9  9  7          9
9  0  7  3         10

In [137]: df['A1'] = ''

In [138]: df.loc[(df.index >= df[df.operation == 4].index[0]) & \
   .....:        (df.index <= df[df.operation == 9].index[0]), 'A1'] = 1

In [139]: df['A2'] = ''

In [140]: df.loc[(df.index >= df[df.operation == 2].index[0]) & \
   .....:        (df.index <= df[df.operation == 7].index[0]), 'A2'] = 2

In [141]: df
Out[141]:
   a  b  c  operation A1 A2
0  8  7  5          1
1  3  1  0          2     2
2  5  4  0          3     2
3  9  7  7          4  1  2
4  4  6  8          5  1  2
5  7  8  0          6  1  2
6  1  8  8          7  1  2
7  0  9  0          8  1
8  9  9  7          9  1
9  0  7  3         10
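
The two assignments above repeat the same pattern, so they could be wrapped in a small helper and driven from a table of cycles. The following is only a sketch: `mark_cycle` and the `cycles` dictionary are hypothetical names introduced for illustration, and it assumes each start and end operation occurs exactly once and that the index is sorted.

```python
import pandas as pd

# Same data as the session above
df = pd.DataFrame({
    "a": [8, 3, 5, 9, 4, 7, 1, 0, 9, 0],
    "b": [7, 1, 4, 7, 6, 8, 8, 9, 9, 7],
    "c": [5, 0, 0, 7, 8, 0, 8, 0, 7, 3],
    "operation": range(1, 11),
})


def mark_cycle(frame, column, start_op, end_op, value):
    """Hypothetical helper: create `column` and fill it with `value` on every
    row from the row where operation == start_op through the row where
    operation == end_op (both inclusive)."""
    # First index label at which each operation occurs (assumed to exist)
    start = frame.index[frame["operation"] == start_op][0]
    end = frame.index[frame["operation"] == end_op][0]
    # Object dtype so blanks and numbers can share the column
    frame[column] = pd.Series("", index=frame.index, dtype=object)
    frame.loc[(frame.index >= start) & (frame.index <= end), column] = value
    return frame


# Hypothetical cycle table: column name -> (start op, end op, fill value)
cycles = {"A1": (4, 9, 1), "A2": (2, 7, 2)}
for name, (start_op, end_op, value) in cycles.items():
    mark_cycle(df, name, start_op, end_op, value)

print(df)
```

Because every cycle gets its own column, overlapping cycles such as A1 and A2 here do not overwrite each other. Adding A3 would only require one more entry in `cycles`.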

OLD answer:

In [93]: df
Out[93]:
   a  b  c OPERATION
0  1  1  0       op2
1  1  5  6       op2
2  7  7  2     START
3  1  6  2       op2
4  7  7  4       op2
5  3  2  0       op2
6  6  9  9       op1
7  6  1  4       END
8  5  3  6       op1
9  9  2  9       op3

In [94]: df[(df.index >= df[df.OPERATION == 'START'].index[0]) & \
   ....:    (df.index <= df[df.OPERATION == 'END'].index[0])]
Out[94]:
   a  b  c OPERATION
2  7  7  2     START
3  1  6  2       op2
4  7  7  4       op2
5  3  2  0       op2
6  6  9  9       op1
7  6  1  4       END

Explanation:

In [53]: df.OPERATION == 'START'
Out[53]:
0    False
1    False
2     True
3    False
4    False
5    False
6    False
7    False
8    False
9    False
Name: OPERATION, dtype: bool

In [54]: df[df.OPERATION == 'START'].index.tolist()
Out[54]: [2]

In [55]: df.index >= df[df.OPERATION == 'START'].index[0]  # compare against the single label, not a list
Out[55]: array([False, False,  True,  True,  True,  True,  True,  True,  True,  True], dtype=bool)
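
The steps of the explanation can be put together in one runnable script. This is a minimal sketch that assumes a single START row and a single END row and a sorted index. The variable names `is_start`, `start`, `end`, `mask` and `selected` are introduced here for illustration only.

```python
import pandas as pd

# Same data as in the OLD answer
df = pd.DataFrame({
    "a": [1, 1, 7, 1, 7, 3, 6, 6, 5, 9],
    "b": [1, 5, 7, 6, 7, 2, 9, 1, 3, 2],
    "c": [0, 6, 2, 2, 4, 0, 9, 4, 6, 9],
    "OPERATION": ["op2", "op2", "START", "op2", "op2",
                  "op2", "op1", "END", "op1", "op3"],
})

# Step 1: boolean Series, True only on the START row
is_start = df["OPERATION"] == "START"

# Step 2: index labels of the START and END rows (first match of each)
start = df.index[is_start][0]
end = df.index[df["OPERATION"] == "END"][0]

# Step 3: boolean mask covering every row from START through END
mask = (df.index >= start) & (df.index <= end)

# Step 4: select those rows
selected = df[mask]
print(selected)
```

With a sorted index, `df.loc[start:end]` should select the same rows, because label-based slices in `.loc` include both endpoints.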