Suppose I have two Series in pandas:
from datetime import datetime, timedelta
import pandas as pd
d = datetime.now()
index = [d + timedelta(seconds = i) for i in range(5)]
a = pd.Series([1,4,5,7,8], index = index)
b = pd.Series([2,3,6,7,8], index = index)
What is the best way to get the element-wise minimum/maximum of the values at each corresponding index? Something like:
min_func(a, b): [1,3,5,7,8] (for given index)
max_func(a, b): [2,4,6,7,8]
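The only functions I found in the documentation are min/max, which return the min/max within a single Series, and .apply does not accept an index argument. Is there a better way to do this, without iterating over the Series manually or resorting to arithmetic tricks (like min_func: a*(a<b) + b*(b>=a))?
Thanks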
Answer 0 (score: 7)
Combine the Series into a DataFrame, which aligns them on the index automatically
In [51]: index
Out[51]:
[datetime.datetime(2013, 8, 26, 18, 33, 48, 990974),
datetime.datetime(2013, 8, 26, 18, 33, 49, 990974),
datetime.datetime(2013, 8, 26, 18, 33, 50, 990974),
datetime.datetime(2013, 8, 26, 18, 33, 51, 990974),
datetime.datetime(2013, 8, 26, 18, 33, 52, 990974)]
In [52]: a = pd.Series([1,4,5,7,8], index = index)
In [53]: b = pd.Series([2,3,6,7,8], index = index)
In [54]: a
Out[54]:
2013-08-26 18:33:48.990974 1
2013-08-26 18:33:49.990974 4
2013-08-26 18:33:50.990974 5
2013-08-26 18:33:51.990974 7
2013-08-26 18:33:52.990974 8
dtype: int64
In [55]: b
Out[55]:
2013-08-26 18:33:48.990974 2
2013-08-26 18:33:49.990974 3
2013-08-26 18:33:50.990974 6
2013-08-26 18:33:51.990974 7
2013-08-26 18:33:52.990974 8
dtype: int64
In [56]: df = pd.DataFrame({ 'a' : a, 'b' : b })
In [57]: df
Out[57]:
a b
2013-08-26 18:33:48.990974 1 2
2013-08-26 18:33:49.990974 4 3
2013-08-26 18:33:50.990974 5 6
2013-08-26 18:33:51.990974 7 7
2013-08-26 18:33:52.990974 8 8
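To illustrate the alignment step, here is a minimal sketch with hypothetical data (the string labels and values below are not from the question). The two Series carry the same labels in a different order, and building the frame from a dict matches them by label rather than by position.

```python
import pandas as pd

# Hypothetical data: same labels, but b is stored in a different order.
a = pd.Series([1, 4, 5], index=["x", "y", "z"])
b = pd.Series([6, 3, 2], index=["z", "y", "x"])

# Building the frame from a dict aligns both Series on their index labels.
df = pd.DataFrame({"a": a, "b": b})

# Each row now pairs the values that share a label, not a position.
pair_x = (df.loc["x", "a"], df.loc["x", "b"])
pair_z = (df.loc["z", "a"], df.loc["z", "b"])
print(df)
```

Because the pairing is by label, the row-wise reductions that follow compare the values that belong together.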
Min/max
In [9]: df.max(1)
Out[9]:
2013-08-26 18:33:48.990974 2
2013-08-26 18:33:49.990974 4
2013-08-26 18:33:50.990974 6
2013-08-26 18:33:51.990974 7
2013-08-26 18:33:52.990974 8
Freq: S, dtype: int64
In [10]: df.min(1)
Out[10]:
2013-08-26 18:33:48.990974 1
2013-08-26 18:33:49.990974 3
2013-08-26 18:33:50.990974 5
2013-08-26 18:33:51.990974 7
2013-08-26 18:33:52.990974 8
Freq: S, dtype: int64
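The session above can be condensed into a standalone script. This is a sketch under a few assumptions: the start timestamp is fixed so the run is reproducible, the axis is passed by keyword, and the NumPy ufuncs at the end are an alternative that is not part of the answer. They should give the same values here because both Series share an identical index.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# Fixed start time (an assumption, used instead of datetime.now()).
start = datetime(2013, 8, 26, 18, 33, 48)
index = [start + timedelta(seconds=i) for i in range(5)]

a = pd.Series([1, 4, 5, 7, 8], index=index)
b = pd.Series([2, 3, 6, 7, 8], index=index)

# Row-wise reductions over the combined frame.
df = pd.DataFrame({"a": a, "b": b})
row_min = df.min(axis=1)
row_max = df.max(axis=1)

# Alternative (not from the answer): element-wise NumPy ufuncs on the two Series.
np_min = np.minimum(a, b)
np_max = np.maximum(a, b)

print(row_min.tolist())
print(row_max.tolist())
```

The DataFrame route extends naturally to more than two Series; the ufunc route only takes two inputs at a time.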
Column label of the min/max
In [11]: df.idxmax(1)
Out[11]:
2013-08-26 18:33:48.990974 b
2013-08-26 18:33:49.990974 a
2013-08-26 18:33:50.990974 b
2013-08-26 18:33:51.990974 a
2013-08-26 18:33:52.990974 a
Freq: S, dtype: object
In [12]: df.idxmin(1)
Out[12]:
2013-08-26 18:33:48.990974 a
2013-08-26 18:33:49.990974 b
2013-08-26 18:33:50.990974 a
2013-08-26 18:33:51.990974 a
2013-08-26 18:33:52.990974 a
Freq: S, dtype: object
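The label lookups can be reproduced in a short script as well. This sketch uses a default integer index instead of the timestamps (an assumption made for brevity). Note the last two rows, where both columns hold the same value: idxmax and idxmin report the first column that holds the extreme value, which is why 'a' appears in both outputs there.

```python
import pandas as pd

# Same values as in the question, with a default integer index for brevity.
a = pd.Series([1, 4, 5, 7, 8])
b = pd.Series([2, 3, 6, 7, 8])
df = pd.DataFrame({"a": a, "b": b})

# Column label that holds the row-wise maximum / minimum.
largest = df.idxmax(axis=1)
smallest = df.idxmin(axis=1)

print(largest.tolist())
print(smallest.tolist())
```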