Python: delete rows with a specific time of day from a timestamp index (regardless of date)

Time: 2014-05-01 20:11:39

Tags: python pandas

I have a DataFrame with this timestamp index:

2011-01-07 09:30:00
2011-01-07 09:35:00
2011-01-07 09:40:00
...
2011-01-08 09:30:00
2011-01-08 09:35:00
2011-01-08 09:40:00
...
2011-01-09 09:30:00
2011-01-09 09:35:00
2011-01-09 09:40:00

Is there a quick way, without some kind of loop, to delete every row whose time is 09:30:00, regardless of the date?

2 answers:

Answer 0: (score: 2)

Build a test frame (this assumes import numpy as np and from pandas import DataFrame, date_range):

In [28]: df = DataFrame(np.random.randn(400,1),index=date_range('20130101',periods=400,freq='15min'))

In [29]: df = df.take(df.index.indexer_between_time('9:00','10:00'))

In [30]: df
Out[30]: 
                            0
2013-01-01 09:00:00 -1.452507
2013-01-01 09:15:00 -0.244847
2013-01-01 09:30:00 -0.654370
2013-01-01 09:45:00 -0.689975
2013-01-01 10:00:00 -1.506261
2013-01-02 09:00:00 -0.096923
2013-01-02 09:15:00 -1.371506
2013-01-02 09:30:00  1.481053
2013-01-02 09:45:00  0.327030
2013-01-02 10:00:00  1.614000
2013-01-03 09:00:00 -1.313668
2013-01-03 09:15:00  0.563914
2013-01-03 09:30:00 -0.117773
2013-01-03 09:45:00  0.309642
2013-01-03 10:00:00 -0.386824
2013-01-04 09:00:00 -1.245194
2013-01-04 09:15:00  0.930746
2013-01-04 09:30:00  1.088279
2013-01-04 09:45:00 -0.927087
2013-01-04 10:00:00 -1.098625

[20 rows x 1 columns]
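As a side note, the take/indexer_between_time step above can be sketched in a self-contained way. This is a minimal illustration, not the answerer's code: the sequential values replace the random data so the result is reproducible, and the comparison with DataFrame.between_time is added here to show that both routes select the same window.

```python
import numpy as np
import pandas as pd

# 400 rows at 15-minute spacing, starting 2013-01-01 00:00 (same shape as the test frame)
idx = pd.date_range("2013-01-01", periods=400, freq="15min")
df = pd.DataFrame(np.arange(400.0), index=idx)

# Integer positions of the rows whose time of day lies in [09:00, 10:00]
positions = df.index.indexer_between_time("9:00", "10:00")
by_position = df.take(positions)

# The DataFrame-level helper selects the same rows by time of day
by_time = df.between_time("9:00", "10:00")

print(len(by_position))
print(by_position.equals(by_time))
```

Both endpoints are included by default, which is why each full day contributes five rows.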

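indexer_between_time returns the integer positions of the rows we want to remove, so just take the corresponding labels out of the original index. The answer originally did this with the - operator on the index; recent pandas versions no longer treat - on an Index as a set difference, so Index.difference is used here instead.

In [31]: df.reindex(df.index.difference(df.index.take(df.index.indexer_between_time('9:30:00','9:30:00'))))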
Out[31]: 
                            0
2013-01-01 09:00:00 -1.452507
2013-01-01 09:15:00 -0.244847
2013-01-01 09:45:00 -0.689975
2013-01-01 10:00:00 -1.506261
2013-01-02 09:00:00 -0.096923
2013-01-02 09:15:00 -1.371506
2013-01-02 09:45:00  0.327030
2013-01-02 10:00:00  1.614000
2013-01-03 09:00:00 -1.313668
2013-01-03 09:15:00  0.563914
2013-01-03 09:45:00  0.309642
2013-01-03 10:00:00 -0.386824
2013-01-04 09:00:00 -1.245194
2013-01-04 09:15:00  0.930746
2013-01-04 09:45:00 -0.927087
2013-01-04 10:00:00 -1.098625

[16 rows x 1 columns]
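The removal step can also be written as a standalone script. The following is a sketch under a few assumptions: the column name value and the sequential data are made up for illustration, and the two variants (DataFrame.drop and Index.difference) are shown side by side as alternatives rather than as the answer's exact method.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Rebuild a comparable frame: 15-minute data restricted to 09:00-10:00 each day
idx = pd.date_range("2013-01-01", periods=400, freq="15min")
df = pd.DataFrame({"value": np.arange(400.0)}, index=idx).between_time("9:00", "10:00")

# Labels of every row stamped 09:30:00, on any date
to_drop = df.index[df.index.indexer_between_time("9:30", "9:30")]

# Variant 1: drop those labels directly
dropped = df.drop(to_drop)

# Variant 2: set difference on the index, then select the remaining labels
kept_index = df.index.difference(to_drop)
selected = df.loc[kept_index]

print(len(df), len(dropped), dropped.equals(selected))
```

Neither variant loops over rows in Python; the time-of-day comparison happens inside the index.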

Answer 1: (score: 0)

You need to do something like this:

>>> x = pd.DataFrame([[1,2,3,4],[3,3,3,3],[8,7,3,2],[9,9,9,4],[2,2,2,4]])
>>> x
   0  1  2  3
0  1  2  3  4
1  3  3  3  3
2  8  7  3  2
3  9  9  9  4
4  2  2  2  4

[5 rows x 4 columns]
>>> x[x[3] == 4]
   0  1  2  3
0  1  2  3  4
3  9  9  9  4
4  2  2  2  4

[3 rows x 4 columns]

In your case, the condition would be on the timestamp column. x[x[3] == 4] means: keep only the rows where column 3 has the value 4.
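Applied to the question, where the timestamps are the index rather than a column, the same boolean-mask idea might look like the sketch below. The column name price and the sample values are hypothetical; the mask compares the time-of-day part of each index entry against 09:30:00.

```python
import datetime as dt

import pandas as pd

# Small frame shaped like the one in the question (values are made up)
idx = pd.to_datetime([
    "2011-01-07 09:30:00", "2011-01-07 09:35:00", "2011-01-07 09:40:00",
    "2011-01-08 09:30:00", "2011-01-08 09:35:00", "2011-01-08 09:40:00",
])
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

# True for every row whose time of day is not 09:30:00, whatever the date
mask = df.index.time != dt.time(9, 30)
result = df[mask]

print(result)
```

The mask is computed for the whole index at once, so no explicit loop over rows is needed.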