I have a dataframe news_df containing article titles and dates, and I would like to group the articles written on the same day into a single row.
name
date
2019-01-17 14:41:00 Forte hausse de l'indice Philly Fed en janvier
2019-01-17 14:36:00 Baisse des inscriptions hebdomadaires au chômage
2019-01-16 22:30:00 Wall Street finit en hausse, Goldman Sachs et ...
2019-01-16 16:14:00 Wall Street, soutenue par les résultats de ban...
2019-01-16 14:36:00 Baisse de 1% des prix à l'import en décembre
...
I tried:
news_df.resample('D', on='name')
But it gives me a TypeError:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-275-bdfd57eadc21> in <module>
----> 1 news_df.resample('D', on='name')
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in resample(self, rule, how, axis, fill_method, closed, label, convention, kind, loffset, limit, base, on, level)
7108 axis=axis, kind=kind, loffset=loffset,
7109 convention=convention,
-> 7110 base=base, key=on, level=level)
7111 return _maybe_process_deprecations(r,
7112 how=how,
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\resample.py in resample(obj, kind, **kwds)
1146 """ create a TimeGrouper and return our resampler """
1147 tg = TimeGrouper(**kwds)
-> 1148 return tg._get_resampler(obj, kind=kind)
1149
1150
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\resample.py in _get_resampler(self, obj, kind)
1274 raise TypeError("Only valid with DatetimeIndex, "
1275 "TimedeltaIndex or PeriodIndex, "
-> 1276 "but got an instance of %r" % type(ax).__name__)
1277
1278 def _get_grouper(self, obj, validate=True):
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
Answer 0 (score: 0)
# move the date index into a regular column
df = news_df.reset_index()
# optional: parse the date strings into datetimes
df['date'] = pd.to_datetime(df['date'])
# group by date and collect each group's titles into a list
df = df.groupby('date')['name'].apply(list)
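One thing to watch: grouping on the full timestamps puts every article in its own group, because each timestamp is distinct. Below is a minimal sketch that truncates to the calendar day before grouping. The sample frame is rebuilt from the rows shown in the question (truncated titles kept as they appear), and by_day is just an illustrative name.

```python
import pandas as pd

# Sample data rebuilt from the question; truncated titles are kept as shown there.
news_df = pd.DataFrame(
    {
        "name": [
            "Forte hausse de l'indice Philly Fed en janvier",
            "Baisse des inscriptions hebdomadaires au chômage",
            "Wall Street finit en hausse, Goldman Sachs et ...",
            "Wall Street, soutenue par les résultats de ban...",
            "Baisse de 1% des prix à l'import en décembre",
        ]
    },
    index=pd.Index(
        [
            "2019-01-17 14:41:00",
            "2019-01-17 14:36:00",
            "2019-01-16 22:30:00",
            "2019-01-16 16:14:00",
            "2019-01-16 14:36:00",
        ],
        name="date",
    ),
)

# Move the date index into a column and parse it as datetimes.
df = news_df.reset_index()
df["date"] = pd.to_datetime(df["date"])

# Group by the calendar day only (time of day dropped) and collect the titles.
by_day = df.groupby(df["date"].dt.date)["name"].apply(list)
print(by_day)
```

As for the original error: on='name' points resample at the title column, and resample only works on a datetime-like index or column, which is why it complains about getting a plain Index.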
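Edit:
You need to set the strptime format to match whatever format your input dates are in. This assumes the dates are still strings and that datetime was imported with import datetime as dt.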
df['date'] = df['date'].apply(lambda x: dt.datetime.strptime(x, "%Y-%m-%d %H:%M:%S").strftime('%d/%m/%Y'))
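For what it's worth, here is a small self-contained sketch of the strptime route. It assumes the dates are plain strings in the format shown in the question ("%Y-%m-%d %H:%M:%S"); in_fmt, the day column and grouped are illustrative names, not anything from the original post.

```python
import datetime as dt

import pandas as pd

# A few rows from the question, with the dates left as plain strings.
df = pd.DataFrame(
    {
        "date": [
            "2019-01-17 14:41:00",
            "2019-01-17 14:36:00",
            "2019-01-16 14:36:00",
        ],
        "name": [
            "Forte hausse de l'indice Philly Fed en janvier",
            "Baisse des inscriptions hebdomadaires au chômage",
            "Baisse de 1% des prix à l'import en décembre",
        ],
    }
)

# The input format must match the strings exactly (assumed from the question's data).
in_fmt = "%Y-%m-%d %H:%M:%S"

# Parse each string, then keep only the day part as a dd/mm/YYYY label.
df["day"] = df["date"].apply(
    lambda x: dt.datetime.strptime(x, in_fmt).strftime("%d/%m/%Y")
)

# Articles sharing a day label now end up in the same row.
grouped = df.groupby("day")["name"].apply(list)
print(grouped)
```

If the format string does not match the input, strptime raises a ValueError, so that is the first thing to check when this step fails. Also note that the day labels are strings here, so they sort as text rather than chronologically once the data spans more than one month.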