Question

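In pandas 0.12, if you call the resample method on a DataFrame with a custom resampling function, the custom function is called once per DataFrame row and has access to the values in all columns. In pandas 0.15, the resample method calls the custom function once per DataFrame entry, and the only value available is that entry (not the entire row).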

How can I restore the 0.12 behavior and see the entire row inside the custom function?

Here is the difference.

Initial setup:

In [1]: import pandas

In [2]: import datetime

In [3]: import sys

In [4]: dt = datetime.datetime(2014,1,1)

In [5]: idx = [dt + datetime.timedelta(days=i) for i in [0,2]]

In [6]: df = pandas.DataFrame({'a': [1.0, 2.0], 'b': ['x', 'y']}, index=idx)

In [7]: foo = lambda data: sys.stdout.write("***\n" + str(data) + "\n")

0.12 behavior (note that foo is called 3 times):

In [8]: pandas.__version__
Out[8]: '0.12.0'

In [9]: df.resample(rule='D', how=foo, fill_method='ffill')
***
            a  b
2014-01-01  1  x
***
Empty DataFrame
Columns: [a, b]
Index: []
***
            a  b
2014-01-03  2  y
Out[9]: 
               a     b
2014-01-01  None  None
2014-01-02  None  None
2014-01-03  None  None

0.15 behavior (note that foo is called 6 times):

In [8]: pandas.__version__
Out[8]: '0.15.0'

In [9]: df.resample(rule='D', how=foo, fill_method='ffill')
***
2014-01-01    1
Name: a, dtype: float64
***
Series([], name: a, dtype: float64)
***
2014-01-03    2
Name: a, dtype: float64
***
2014-01-01    x
Name: b, dtype: object
***
Series([], name: b, dtype: object)
***
2014-01-03    y
Name: b, dtype: object
Out[9]: 
             a     b
2014-01-01 NaN  None
2014-01-02 NaN  None
2014-01-03 NaN  None

Answer 1

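I don't know why the behavior changed, but using TimeGrouper with groupby gets you back to the original behavior: foo receives a DataFrame with all columns for each daily bin. Note that it raises an error unless foo returns a value: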

In [496]: df.groupby(pandas.TimeGrouper('D')).apply(foo)
***
            a  b
2014-01-01  1  x
***
Empty DataFrame
Columns: [a, b]
Index: []
***
            a  b
2014-01-03  2  y
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
.....
ValueError: All objects passed were None
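
The error above goes away once the function returns something. Here is a minimal sketch of that idea, not a definitive recipe. It assumes a newer pandas in which pd.Grouper(freq='D') takes the place of TimeGrouper (TimeGrouper was deprecated and later removed), and summarize is a hypothetical function made up for illustration. It reads both columns of each group and returns a Series, so apply has real values to combine:

```python
import datetime

import pandas as pd

# Same data as in the question.
dt = datetime.datetime(2014, 1, 1)
idx = [dt + datetime.timedelta(days=i) for i in [0, 2]]
df = pd.DataFrame({'a': [1.0, 2.0], 'b': ['x', 'y']}, index=idx)


def summarize(group):
    # Hypothetical example function: `group` is a DataFrame holding
    # every column for one daily bin, so both 'a' and 'b' are visible.
    # It may be empty (as the transcript shows for 2014-01-02), which
    # sum(), join() and len() all tolerate.
    return pd.Series({
        'a_sum': group['a'].sum(),
        'b_joined': ','.join(group['b']),
        'rows': len(group),
    })


# Assumption: pd.Grouper(freq='D') is used here instead of TimeGrouper('D').
result = df.groupby(pd.Grouper(freq='D')).apply(summarize)
print(result)
```

Because summarize returns a Series with the same labels for every group, the result is a DataFrame with one row per group and one column per label.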
