Pandas: group daily text data into weekly rows with a date index

Time: 2019-11-23 14:42:06

Tags: python

Sample dataset:

Date            Text
2018-01-01      Apple
2018-01-01      Banana
2018-01-06      Cat
2018-01-08      Dog
2018-01-09      Elephant

I want my dataset to look like this:

Date              Text
2018-01-01        Apple Banana Cat
2018-01-08        Dog Elephant

1 answer:

Answer 0: (score: 1)

Try using the groupby function. Here is some sample code that may help:

import pandas as pd
import numpy as np

df = pd.DataFrame({
        'Date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05', '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'], 
        'Sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww']
    })
print(df)

This gives:

         Date   Sym
0  2015-05-08  aapl
1  2015-05-07  aapl
2  2015-05-06  aapl
3  2015-05-05  aapl
4  2015-05-08  aaww
5  2015-05-07  aaww
6  2015-05-06  aaww
7  2015-05-05  aaww

Now, if we group by date:

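# Join the unique symbols of each date into a single string.
# Selecting 'Sym' explicitly avoids relying on whether apply() passes the grouping column.
grouped = df.groupby('Date')['Sym'].agg(lambda s: ''.join(np.unique(s)))

# Rebuild a two-column frame: date in column 0, joined symbols in column 1.
df = pd.DataFrame([[date, syms] for date, syms in grouped.items()])
print(df)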

It gives the output:

            0         1
0  2015-05-05  aaplaaww
1  2015-05-06  aaplaaww
2  2015-05-07  aaplaaww
3  2015-05-08  aaplaaww
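
The example above groups by exact date. For the weekly grouping asked about in the question, the same groupby idea can be applied to a "week start" key instead. The following is only a sketch under a couple of assumptions: a week is taken to start on Monday and to be labelled by that Monday (which matches the desired output, since 2018-01-01 is a Monday), and the names week_start and weekly are illustrative, not from the original post.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Date': ['2018-01-01', '2018-01-01', '2018-01-06', '2018-01-08', '2018-01-09'],
    'Text': ['Apple', 'Banana', 'Cat', 'Dog', 'Elephant'],
})
df['Date'] = pd.to_datetime(df['Date'])

# Assumption: weeks start on Monday. dt.weekday is 0 for Monday,
# so subtracting that many days moves each date back to its Monday.
week_start = df['Date'] - pd.to_timedelta(df['Date'].dt.weekday, unit='D')

# Group by the week's Monday and join the texts with a space.
weekly = df.groupby(week_start)['Text'].agg(' '.join)
weekly.index.name = 'Date'
print(weekly)
```

With the sample data this should yield one row for the week of 2018-01-01 ("Apple Banana Cat") and one for the week of 2018-01-08 ("Dog Elephant"). If weeks should start on a different day, the offset calculation would need to be adjusted accordingly.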