I need to get a 7-day calendar view from a pandas DataFrame containing a list of events. Here is a sample of the dates.
DatetimeIndex(['2017-05-15', '2017-05-12', '2017-05-07', '2017-05-15',
'2017-05-17', '2017-05-17', '2017-05-07', '2017-05-01',
'2017-05-07', '2017-05-04', '2017-05-02', '2017-05-01',
'2017-05-06', '2017-05-15', '2017-05-13', '2017-05-06',
'2017-05-03', '2017-04-21', '2017-04-10', '2017-04-10',
'2017-04-18', '2017-03-13', '2017-04-13', '2017-05-04',
'2017-03-16', '2017-05-01', '2017-04-15', '2017-04-01',
'2017-04-01', '2017-04-01'],
dtype='datetime64[ns]', name=u'Date', freq=None)
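I need to arrange the dates above into an n x 7 matrix, where n is the number of weeks and the columns are Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday.
Because some dates are missing, I generated every possible date between the earliest and the latest one.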
min_date = min(df['Date'])
max_date = max(df['Date'])
idx = pd.date_range(min_date, max_date)
DatetimeIndex(['2017-04-01', '2017-04-02', '2017-04-03', '2017-04-04',
'2017-04-05', '2017-04-06', '2017-04-07', '2017-04-08',
'2017-04-09', '2017-04-10', '2017-04-11', '2017-04-12',
'2017-04-13', '2017-04-14', '2017-04-15', '2017-04-16',
'2017-04-17', '2017-04-18', '2017-04-19', '2017-04-20',
'2017-04-21', '2017-04-22', '2017-04-23', '2017-04-24',
'2017-04-25', '2017-04-26', '2017-04-27', '2017-04-28',
'2017-04-29', '2017-04-30', '2017-05-01', '2017-05-02',
'2017-05-03', '2017-05-04', '2017-05-05', '2017-05-06',
'2017-05-07'],
dtype='datetime64[ns]', freq='D')
Then, with the following line, I already know which column of the final matrix each date belongs in:
week = idx.dayofweek
>> array([5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6,
0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6])
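Is there a pythonic way to convert idx into an n x 7 matrix? That way I can check whether a date in the original DataFrame equals the date at (i, j) and fill in the matrix accordingly.
Answer 0: (score: 2)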
If you modify the initial creation of idx so that it starts on a Monday and ends on a Sunday, by changing
idx = pd.date_range(min_date, max_date)
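to the following (dt is assumed here to be the standard datetime module, imported with import datetime as dt)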
idx = pd.date_range(min_date-dt.timedelta(days=min_date.weekday()),
max_date+dt.timedelta(days=6-max_date.weekday()))
then you can use NumPy's reshape to rearrange it into seven columns:
idx.values.reshape(len(idx)//7, 7)
If needed, you can convert the result back into a DataFrame.
Using your example,
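The reshape only works because the padded range covers whole weeks. The following is a minimal sketch of that padding step, using two endpoints chosen purely for illustration (they are not taken from the question's DataFrame). The names start, end, n_weeks, remainder and calendar are hypothetical.

```python
import datetime as dt

import pandas as pd

# Hypothetical endpoints for illustration: 2017-03-16 is a Thursday and
# 2017-05-17 is a Wednesday.
min_date = pd.Timestamp("2017-03-16")
max_date = pd.Timestamp("2017-05-17")

# weekday() is 0 for Monday and 6 for Sunday, so these shifts move the
# start back to its Monday and the end forward to its Sunday.
start = min_date - dt.timedelta(days=min_date.weekday())
end = max_date + dt.timedelta(days=6 - max_date.weekday())

idx = pd.date_range(start, end)

# The padded range covers whole weeks, so its length is a multiple of 7
# and the reshape cannot fail on a leftover partial week.
n_weeks, remainder = divmod(len(idx), 7)
calendar = idx.values.reshape(n_weeks, 7)
```

Without the padding, len(idx) is generally not divisible by 7, and the reshape would either raise an error or, with floor division, request a shape that does not match the number of elements.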
import datetime as dt
import pandas as pd

date = pd.DatetimeIndex(['2017-05-15', '2017-05-12', '2017-05-07', '2017-05-15',
'2017-05-17', '2017-05-17', '2017-05-07', '2017-05-01',
'2017-05-07', '2017-05-04', '2017-05-02', '2017-05-01',
'2017-05-06', '2017-05-15', '2017-05-13', '2017-05-06',
'2017-05-03', '2017-04-21', '2017-04-10', '2017-04-10',
'2017-04-18', '2017-03-13', '2017-04-13', '2017-05-04',
'2017-03-16', '2017-05-01', '2017-04-15', '2017-04-01',
'2017-04-01', '2017-04-01'],
dtype='datetime64[ns]', name=u'Date', freq=None)
min_date = min(date)
max_date = max(date)
idx = pd.date_range(min_date-dt.timedelta(days=min_date.weekday()),
max_date+dt.timedelta(days=6-max_date.weekday()))
pd.DataFrame(idx.values.reshape(len(idx)//7, 7), columns=idx[:7].strftime('%A'))
Out[222]:
      Monday    Tuesday  Wednesday   Thursday     Friday   Saturday     Sunday
0 2017-03-13 2017-03-14 2017-03-15 2017-03-16 2017-03-17 2017-03-18 2017-03-19
1 2017-03-20 2017-03-21 2017-03-22 2017-03-23 2017-03-24 2017-03-25 2017-03-26
2 2017-03-27 2017-03-28 2017-03-29 2017-03-30 2017-03-31 2017-04-01 2017-04-02
3 2017-04-03 2017-04-04 2017-04-05 2017-04-06 2017-04-07 2017-04-08 2017-04-09
4 2017-04-10 2017-04-11 2017-04-12 2017-04-13 2017-04-14 2017-04-15 2017-04-16
5 2017-04-17 2017-04-18 2017-04-19 2017-04-20 2017-04-21 2017-04-22 2017-04-23
6 2017-04-24 2017-04-25 2017-04-26 2017-04-27 2017-04-28 2017-04-29 2017-04-30
7 2017-05-01 2017-05-02 2017-05-03 2017-05-04 2017-05-05 2017-05-06 2017-05-07
8 2017-05-08 2017-05-09 2017-05-10 2017-05-11 2017-05-12 2017-05-13 2017-05-14
9 2017-05-15 2017-05-16 2017-05-17 2017-05-18 2017-05-19 2017-05-20 2017-05-21
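The question's end goal is to fill the matrix from the original events. One possible way to do that, sketched here under assumptions, is to count events per day, align the counts to the padded range, and reshape the counts instead of the dates. The shortened dates list and the names counts and calendar_counts are hypothetical. This is not part of the original answer.

```python
import datetime as dt

import pandas as pd

# A shortened, hypothetical version of the event dates from the question.
dates = pd.DatetimeIndex(
    ["2017-05-15", "2017-05-12", "2017-05-07", "2017-05-15", "2017-05-17"],
    name="Date",
)

min_date, max_date = dates.min(), dates.max()
idx = pd.date_range(
    min_date - dt.timedelta(days=min_date.weekday()),
    max_date + dt.timedelta(days=6 - max_date.weekday()),
)

# Count events per day, then align the counts to every day in the padded
# range; days without events get 0.
counts = pd.Series(1, index=dates).groupby(level=0).sum()
counts = counts.reindex(idx, fill_value=0)

calendar_counts = pd.DataFrame(
    counts.to_numpy().reshape(len(idx) // 7, 7),
    index=idx[::7].strftime("%Y-%m-%d"),  # label each row by its Monday
    columns=idx[:7].strftime("%A"),
)
print(calendar_counts)
```

Because the counts share the same daily index as idx, cell (i, j) of calendar_counts corresponds to the date at (i, j) in the reshaped date matrix, so no explicit loop over the cells is needed.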