Question

Suppose I have the following event dataframe df_e:

|------|------------|-------|
| group| event date | count |
| x123 | 2016-01-06 | 1     |
|      | 2016-01-08 | 10    |
|      | 2016-02-15 | 9     |
|      | 2016-05-22 | 6     |
|      | 2016-05-29 | 2     |
|      | 2016-05-31 | 6     |
|      | 2016-12-29 | 1     |
| x124 | 2016-01-01 | 1     |
...

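I also know t0, the start of time (say 2016-01-01 for x123), and tN, the end of the experiment, which comes from another dataframe df_s (2017-05-25). How can I create a dataframe df_new that looks like this: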

|------|------------|---------------|--------|
| group| obs. weekly| lifetime, week| status |
| x123 | 2016-01-01 | 1             | 1      |
|      | 2016-01-08 | 0             | 0      |
|      | 2016-01-15 | 0             | 0      |
|      | 2016-01-22 | 1             | 1      |
|      | 2016-01-29 | 2             | 1      |
...
|      | 2017-05-18 | 1             | 1      |
|      | 2017-05-25 | 1             | 1      |
...
| x124 | 2017-05-18 | 1             | 1      |
| x124 | 2017-05-25 | 1             | 1      |

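Explanation: take t0 and generate one row per weekly period until tN. For each row R, search within that group for an event date that falls inside R's range. If one is found (True), count how many weeks it has lived there and set status = 1 (alive); otherwise set the lifetime and status columns for this R to 0, i.e. dead.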

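Questions:

1) How do I generate the dataframe per group from the given t0 and tN values, e.g. generate the [group, obs. weekly, lifetime, status] columns for (tN - t0) / week rows?

2) How do I complete the construction of the df_new dataframe described above?

So far, this is where I can start =)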

(tN - t0) / week
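
As a rough sketch of that starting point, the row count and the weekly observation dates could be computed with pandas like this. The dates are the x123 examples from above, and the names n_weeks and obs_weekly are made up for illustration; stepping in fixed 7-day intervals from t0 is an assumption.

```python
import pandas as pd

# Example values for group x123 (tN would come from df_s in practice)
t0 = pd.Timestamp("2016-01-01")
tN = pd.Timestamp("2017-05-25")

# Number of complete weeks between t0 and tN
n_weeks = (tN - t0) // pd.Timedelta(weeks=1)

# One observation date every 7 days, starting exactly at t0 (assumed convention)
obs_weekly = pd.date_range(t0, tN, freq="7D")
```

Note that date_range includes the starting date itself, so it yields one more row than the number of complete weeks.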

Answer 1

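I'm not sure I understood you correctly, but the first group has nothing to do with the second group, right? If so, I think what you want is something like this: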

import pandas as pd

# df_group1 holds the rows of df_e for a single group (e.g. x123)
df_group1 = df_group1.set_index('event date')
df_group1.index = pd.to_datetime(df_group1.index)  # convert the index to datetime so you can 'resample'
# yourfunction is a placeholder: it should turn one week of 'count' values into a single lifetime value
weekly = df_group1['count'].resample('1W').apply(lambda x: yourfunction(x))
df_group1 = weekly.rename('lifetime, week').reset_index()
df_group1['status'] = (df_group1['lifetime, week'] > 0).astype(int)

# do the same with group2 and concat to create df_all

I don't know how you arrive at 'lifetime, week', but all that remains is to write the function that generates it.
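
For what it's worth, here is one possible sketch of the whole construction without resample, so that the weekly rows start exactly at t0 instead of on a calendar weekday. Several things are assumptions, not something the question confirms: a row R covers [R, R + 7 days), status is 1 when at least one event falls in that range, and 'lifetime, week' is taken to be the number of consecutive alive weeks so far. The helper name weekly_status is made up, and t0 and tN are hard-coded here, whereas in practice they would be looked up per group in df_s.

```python
import pandas as pd

# Toy copy of df_e for one group (values taken from the question's table)
df_e = pd.DataFrame({
    "group": ["x123"] * 7,
    "event date": pd.to_datetime([
        "2016-01-06", "2016-01-08", "2016-02-15", "2016-05-22",
        "2016-05-29", "2016-05-31", "2016-12-29",
    ]),
    "count": [1, 10, 9, 6, 2, 6, 1],
})

def weekly_status(events, t0, tN):
    """Hypothetical helper: one row per 7-day period from t0 to tN for one group."""
    t0, tN = pd.Timestamp(t0), pd.Timestamp(tN)
    weeks = pd.date_range(t0, tN, freq="7D")  # period starts, anchored at t0
    # Position of the period each event falls into (0 = first week)
    idx = (events["event date"] - t0).dt.days // 7
    idx = idx[(idx >= 0) & (idx < len(weeks))]  # drop events outside [t0, tN]
    status = pd.Series(0, index=range(len(weeks)))
    status.loc[idx.unique()] = 1
    # Assumed meaning of lifetime: length of the current run of alive weeks
    run_id = (status == 0).cumsum()
    lifetime = status.groupby(run_id).cumsum()
    return pd.DataFrame({
        "obs. weekly": weeks,
        "lifetime, week": lifetime.to_numpy(),
        "status": status.to_numpy(),
    })

t0, tN = "2016-01-01", "2017-05-25"  # assumed identical for every group here
parts = [
    weekly_status(g, t0, tN).assign(group=name)
    for name, g in df_e.groupby("group")
]
df_new = pd.concat(parts, ignore_index=True)[
    ["group", "obs. weekly", "lifetime, week", "status"]
]
```

If lifetime is meant to be something else (for example weeks since the first event), only the two lines that compute run_id and lifetime need to change.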

Pandas: need to create a dataframe for a weekly search of each event occurrence
