This seems like it should be easy, but I can't get it to work. Here is a short example:
import pandas as pd
import numpy as np
# Creating a DataFrame with a timeseries index and random data in 5 columns:
ts = pd.date_range('2016-12-05', periods=180, freq='T')
df = pd.DataFrame(np.random.rand(len(ts), 5), columns=[0,1,2,3,4], index=ts)
# Creating a `grouper` DataFrame, because I don't want to change the columns in `df`:
grouper = pd.DataFrame({'doy' : df.index.dayofyear,
'hr' : df.index.hour}, index=df.index)
grouper['pd'] = pd.cut(grouper['hr'], bins=range(0, 25, 4), right=False)
# And now, to `groupby` using our `grouper`:
df.groupby(grouper['doy'], grouper['pd'])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
...
TypeError: 'Series' objects are mutable, thus they cannot be hashed
# Ok, try it another way:
df.groupby(grouper[['doy', 'pd']])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
...
ValueError: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional
# Another way:
df.groupby((grouper['doy'], grouper['pd']))
# Or equivalently:
df.groupby([grouper['doy'], grouper['pd']])
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
...
AttributeError: 'tuple' object has no attribute 'append'
Is there a way to group a DataFrame using another DataFrame?
Note that this works, but it is not the solution I am looking for:
grouper['doy-pd'] = grouper[['doy', 'pd']].apply(lambda x: '{0}-{1}'.format(x['doy'],x['pd']), axis=1)
df.groupby(grouper['doy-pd'])
because I need the MultiIndex in the result of `groupby().apply()`.
This problem is specific to pandas 0.19.0. After upgrading to 0.19.1, the following works:
df.groupby([grouper['doy'], grouper['pd']])
It was also a working solution in 0.17.x.
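For reference, here is a minimal sketch of the list-of-Series form on a pandas version where it works (that is, not 0.19.0). It reuses the setup from the question. The `observed=True` argument, the `.mean()` aggregation, the `min` frequency alias, and the `result` name are additions for illustration only and assume a reasonably recent pandas release.

```python
import numpy as np
import pandas as pd

# Same setup as in the question; "min" is the minute frequency alias.
ts = pd.date_range("2016-12-05", periods=180, freq="min")
df = pd.DataFrame(np.random.rand(len(ts), 5), columns=[0, 1, 2, 3, 4], index=ts)

# Separate `grouper` DataFrame sharing the index of `df`.
grouper = pd.DataFrame({"doy": df.index.dayofyear,
                        "hr": df.index.hour}, index=df.index)
grouper["pd"] = pd.cut(grouper["hr"], bins=list(range(0, 25, 4)), right=False)

# Group by a list of Series; each Series becomes one level of the result index.
# observed=True (an assumption, not in the question) keeps only bins that occur.
result = df.groupby([grouper["doy"], grouper["pd"]], observed=True).mean()

print(result.index.names)
```

Because the keys are passed as a list of Series rather than as one concatenated string column, the aggregated result is indexed by a two-level MultiIndex named after `doy` and `pd`.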
Closing the question, since it is a version-specific bug.