groupby.rolling.count() raises a "non-unique multi-index" exception

Time: 2018-12-30 14:52:13

Tags: pandas

I am using train_sample.csv from here.

Consider the following:

import pandas as pd


df = pd.read_csv('train_sample.csv')
df = df.drop(['attributed_time'], axis=1)

df['click_time'] = pd.to_datetime(df['click_time'])
df = df.set_index('click_time')
df = df.sort_index()

df['clicks_last_hour'] = df.groupby(['ip']).rolling('1H').count()

where I want to create a new column that counts how many times a given ip clicked in the past hour.

I get:

Traceback (most recent call last):
File "train_sample.py", line 11, in <module>
    df['clicks_last_hour'] = df.groupby(['ip']).rolling('1H').count()
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3119, in __setitem__
    self._set_item(key, value)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3194, in _set_item
    value = self._sanitize_column(key, value)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3378, in _sanitize_column
    value = reindexer(value).T
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3358, in reindexer
    raise e
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3353, in reindexer
    value = value.reindex(self.index)._values
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\util\_decorators.py", line 187, in wrapper
    return func(*args, **kwargs)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3566, in reindex
    return super(DataFrame, self).reindex(**kwargs)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\generic.py", line 3689, in reindex
    fill_value, copy).__finalize__(self)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3501, in _reindex_axes
    fill_value, limit, tolerance)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\frame.py", line 3509, in _reindex_index
    tolerance=tolerance)
File "C:\Users\galah\Miniconda3\envs\venv\lib\site-packages\pandas\core\indexes\multi.py", line 2068, in reindex
    raise Exception("cannot handle a non-unique multi-index!")
Exception: cannot handle a non-unique multi-index!

However, as far as I have checked, there are no duplicate rows with the same ip and click_time.
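For reference, a check like the one described could look like the sketch below. The small frame and its values are made up for illustration (they are not rows from train_sample.csv); only the ip and click_time names come from the code above. The traceback ends in a reindex on a MultiIndex, which suggests that the result of groupby(['ip']).rolling('1H').count() is indexed by (ip, click_time) pairs, so a single repeated pair would probably be enough to trigger the exception.

```python
import pandas as pd

# Hypothetical toy data, not taken from train_sample.csv.
clicks = pd.DataFrame(
    {
        "ip": [1, 1, 2, 1],
        "click_time": pd.to_datetime(
            [
                "2017-11-07 09:30:38",
                "2017-11-07 09:30:38",  # same ip and same second as the row above
                "2017-11-07 09:30:38",  # same second, but a different ip
                "2017-11-07 10:05:00",
            ]
        ),
        "channel": [497, 259, 212, 477],
    }
)

# keep=False marks every member of a repeated (ip, click_time) pair,
# not only the second and later occurrences.
dup_mask = clicks.duplicated(subset=["ip", "click_time"], keep=False)
dup_rows = clicks[dup_mask]

print(dup_rows)
print("rows sharing an (ip, click_time) pair:", int(dup_mask.sum()))
```

If click_time has already been moved into the index, calling reset_index() first makes it available as a column again for the subset argument.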

What am I doing wrong?

0 answers:

No answers yet