Getting the element-wise min and max of two corresponding Series in pandas

Time: 2013-08-26 22:27:27

Tags: python pandas

Suppose I have 2 Series in pandas:

from datetime import datetime, timedelta
import pandas as pd
d = datetime.now()
index = [d + timedelta(seconds = i) for i in range(5)]
a = pd.Series([1,4,5,7,8], index = index)
b = pd.Series([2,3,6,7,8], index = index)

What is the best way to get the min/max of the elements at each corresponding index? Something like:

min_func(a, b): [1,3,5,7,8] (for given index)
max_func(a, b): [2,4,6,7,8]

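The only functions I found in the documentation are min/max, which return the min/max within a single Series, and .apply, which does not accept an index argument. Is there a better way to do this, without iterating over the Series manually or resorting to arithmetic tricks (like min_func: a *(a = a))?

Thanks

1 answer:

Answer 0: (score: 7)

Combine the Series into a frame, which aligns them on the index automatically: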

In [51]: index
Out[51]: 
[datetime.datetime(2013, 8, 26, 18, 33, 48, 990974),
 datetime.datetime(2013, 8, 26, 18, 33, 49, 990974),
 datetime.datetime(2013, 8, 26, 18, 33, 50, 990974),
 datetime.datetime(2013, 8, 26, 18, 33, 51, 990974),
 datetime.datetime(2013, 8, 26, 18, 33, 52, 990974)]

In [52]: a = pd.Series([1,4,5,7,8], index = index)

In [53]: b = pd.Series([2,3,6,7,8], index = index)

In [54]: a
Out[54]: 
2013-08-26 18:33:48.990974    1
2013-08-26 18:33:49.990974    4
2013-08-26 18:33:50.990974    5
2013-08-26 18:33:51.990974    7
2013-08-26 18:33:52.990974    8
dtype: int64

In [55]: b
Out[55]: 
2013-08-26 18:33:48.990974    2
2013-08-26 18:33:49.990974    3
2013-08-26 18:33:50.990974    6
2013-08-26 18:33:51.990974    7
2013-08-26 18:33:52.990974    8
dtype: int64

In [56]: df = pd.DataFrame({ 'a' : a, 'b' : b })

In [57]: df
Out[57]: 
                            a  b
2013-08-26 18:33:48.990974  1  2
2013-08-26 18:33:49.990974  4  3
2013-08-26 18:33:50.990974  5  6
2013-08-26 18:33:51.990974  7  7
2013-08-26 18:33:52.990974  8  8
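
To make the alignment step concrete, here is a minimal sketch in which the two Series only partly share an index. The labels "x", "y", "z" are made up for illustration and are not from the question; where one Series has no value for a label, the frame holds NaN.

```python
import pandas as pd

# Hypothetical labels, chosen only to show alignment on a partly shared index.
a = pd.Series([1, 4, 5], index=["x", "y", "z"])
b = pd.Series([2, 3], index=["y", "z"])

# Building the frame from a dict of Series aligns them on the union of indexes.
df = pd.DataFrame({"a": a, "b": b})

# b has no entry for "x", so that cell is NaN in the combined frame.
missing_in_b = df["b"].isna()

print(df)
print(missing_in_b.tolist())  # [True, False, False]
```

In the question both Series share the same index, so no NaN values appear there.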

Row-wise min/max

In [9]: df.max(1)
Out[9]: 
2013-08-26 18:33:48.990974    2
2013-08-26 18:33:49.990974    4
2013-08-26 18:33:50.990974    6
2013-08-26 18:33:51.990974    7
2013-08-26 18:33:52.990974    8
Freq: S, dtype: int64

In [10]: df.min(1)
Out[10]: 
2013-08-26 18:33:48.990974    1
2013-08-26 18:33:49.990974    3
2013-08-26 18:33:50.990974    5
2013-08-26 18:33:51.990974    7
2013-08-26 18:33:52.990974    8
Freq: S, dtype: int64
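
As a self-contained sketch of the same idea, the script below rebuilds the two Series and computes the row-wise min/max with an explicit `axis=1`. It assumes a fixed start time in place of `datetime.now()` so the run is reproducible. It also shows `np.minimum` / `np.maximum` as a possible alternative that skips the intermediate frame; that variant is not part of the answer above.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# Fixed start time (an assumption for reproducibility; the question uses datetime.now()).
start = datetime(2013, 8, 26, 18, 33, 48)
index = [start + timedelta(seconds=i) for i in range(5)]
a = pd.Series([1, 4, 5, 7, 8], index=index)
b = pd.Series([2, 3, 6, 7, 8], index=index)

# Approach from the answer: combine into a frame, then reduce across columns.
df = pd.DataFrame({"a": a, "b": b})
row_min = df.min(axis=1)
row_max = df.max(axis=1)

# Alternative sketch: NumPy's element-wise ufuncs applied to the two Series.
np_min = np.minimum(a, b)
np_max = np.maximum(a, b)

print(row_min.tolist())  # [1, 3, 5, 7, 8]
print(row_max.tolist())  # [2, 4, 6, 7, 8]
```

The frame-based version extends naturally to more than two Series, since each additional Series is just another column.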

Column label of the row-wise min/max

In [11]: df.idxmax(1)
Out[11]: 
2013-08-26 18:33:48.990974    b
2013-08-26 18:33:49.990974    a
2013-08-26 18:33:50.990974    b
2013-08-26 18:33:51.990974    a
2013-08-26 18:33:52.990974    a
Freq: S, dtype: object

In [12]: df.idxmin(1)
Out[12]: 
2013-08-26 18:33:48.990974    a
2013-08-26 18:33:49.990974    b
2013-08-26 18:33:50.990974    a
2013-08-26 18:33:51.990974    a
2013-08-26 18:33:52.990974    a
Freq: S, dtype: object
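
Note that in the rows where both Series hold the same value (7 and 8), the output above reports 'a'. A small sketch of that tie behavior, using a made-up three-row frame rather than the data from the question: `idxmax` returns the first column that holds the maximum, so the result depends on column order.

```python
import pandas as pd

# Illustrative frame; the last row is a tie between the two columns.
df = pd.DataFrame({"a": [1, 4, 7], "b": [2, 3, 7]})

# With columns ordered a, b the tie resolves to "a".
winners = df.idxmax(axis=1)

# Reordering the columns changes which label wins the tie.
winners_reordered = df[["b", "a"]].idxmax(axis=1)

print(winners.tolist())            # ['b', 'a', 'a']
print(winners_reordered.tolist())  # ['b', 'a', 'b']
```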