How do I group by using another pandas DataFrame?

Time: 2016-12-06 02:11:44

Tags: python pandas

This seems like it should be easy, but I can't get it to work. Here is a short example:

import pandas as pd
import numpy as np

# Creating a DataFrame with a timeseries index and random data in 5 columns:
ts = pd.date_range('2016-12-05', periods=180, freq='T')
df = pd.DataFrame(np.random.rand(len(ts), 5), columns=[0,1,2,3,4], index=ts)

# Creating a separate `grouper` DataFrame, because I don't want to add columns to `df`:
grouper = pd.DataFrame({'doy' : df.index.dayofyear,
                        'hr'  : df.index.hour}, index=df.index)
grouper['pd'] = pd.cut(grouper['hr'], bins=range(0, 25, 4), right=False)
# And now, to `groupby` using our `grouper`:
df.groupby(grouper['doy'], grouper['pd'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
...
TypeError: 'Series' objects are mutable, thus they cannot be hashed
# Ok, try it another way:
df.groupby(grouper[['doy', 'pd']])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
ValueError: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional
# Another way:
df.groupby((grouper['doy'], grouper['pd']))
# Or equivalently:
df.groupby([grouper['doy'], grouper['pd']])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
...
AttributeError: 'tuple' object has no attribute 'append'

Is there a way to group a DataFrame using another DataFrame?

Works, but not a solution:

Note that the following works, but it is not the solution I am looking for:

grouper['doy-pd'] = grouper[['doy', 'pd']].apply(lambda x: '{0}-{1}'.format(x['doy'],x['pd']), axis=1)
df.groupby(grouper['doy-pd'])

because there is no MultiIndex after groupby().apply().
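To illustrate the limitation described above, here is a minimal sketch. It assumes a recent pandas release and, purely for illustration, uses 48 hourly rows spanning two days instead of the 180 one-minute rows from the question, so that more than one group appears. The names `key` and `flat` are hypothetical. It builds the combined string key with vectorized string concatenation rather than `apply`, and shows that the grouped result carries a single-level index:

```python
import numpy as np
import pandas as pd

# 48 hourly timestamps over two days (an assumption for this sketch).
ts = pd.date_range('2016-12-05', periods=48, freq='60min')
df = pd.DataFrame(np.random.rand(len(ts), 5), columns=[0, 1, 2, 3, 4], index=ts)

# Same helper frame as in the question.
grouper = pd.DataFrame({'doy': df.index.dayofyear,
                        'hr': df.index.hour}, index=df.index)
grouper['pd'] = pd.cut(grouper['hr'], bins=list(range(0, 25, 4)), right=False)

# Workaround: one combined string key per row.
key = grouper['doy'].astype(str) + '-' + grouper['pd'].astype(str)
flat = df.groupby(key).mean()

# The result is indexed by the combined strings, not by (doy, pd) pairs.
print(isinstance(flat.index, pd.MultiIndex))
print(flat.index.nlevels)
```

Because both keys are folded into one string, selecting by day or by period alone would require parsing the labels again.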

Note

This problem is specific to pandas 0.19.0. After upgrading to 0.19.1, the following works:

df.groupby([grouper['doy'], grouper['pd']])

It is also a working solution in 0.17.x.

Closing the question, since this is a version-specific bug.
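For reference, a self-contained sketch of the list-of-Series form. It assumes a recent pandas release rather than 0.19.x, and uses 48 hourly rows over two days (not the question's 180 minutes) so that several groups appear. The `observed=True` argument is an addition not present in the question: it does not exist in 0.19.x, and is passed here so that only combinations of `doy` and `pd` that actually occur are kept when one key is categorical. The name `result` is hypothetical.

```python
import numpy as np
import pandas as pd

# 48 hourly timestamps over two days (an assumption for this sketch).
ts = pd.date_range('2016-12-05', periods=48, freq='60min')
df = pd.DataFrame(np.random.rand(len(ts), 5), columns=[0, 1, 2, 3, 4], index=ts)

# Helper frame sharing df's index, as in the question.
grouper = pd.DataFrame({'doy': df.index.dayofyear,
                        'hr': df.index.hour}, index=df.index)
grouper['pd'] = pd.cut(grouper['hr'], bins=list(range(0, 25, 4)), right=False)

# Group df by two columns of the other DataFrame, passed as a list of Series.
# The Series are matched to df through their shared index.
result = df.groupby([grouper['doy'], grouper['pd']], observed=True).mean()

# The result is indexed by a two-level MultiIndex named after the key Series.
print(result.index.names)
print(result.shape)
```

Since the keys stay separate, `df` itself is left unchanged and each level of the result index can be addressed on its own.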

0 answers