Question

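I have a dataset in pandas that contains a unique event key, a person key, a date, and various other columns. I am trying to add a new column that gives, for each row, the count of events for that person before the date on that row. I have been searching, but I only find results that use a fixed criterion (i.e. df['x'] = df[df['date'] < '2018-06-01'], where the date does not change dynamically with each row) or that use the .apply(function) method, which takes a very long time.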

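I am considering putting the df into a sqlite database, joining the table to itself, and then doing a count distinct over a case statement (example below). However, I need to do additional manipulation afterwards, and I think there must be a faster way to do this in Python. Any suggestions?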

Sample data in df (dates repeat and are not in order; multiple people can share a date, and a person can have multiple events on a single date):
[Event, person, date]
[1,1,2018-01-03]
[2,1,2018-01-01]
[3,1,2018-01-02]
[4,2,2018-01-04]
[5,2,2018-01-05]

Desired output

[Event, person, date, count of events]
[1,1,2018-01-03,    2]
[2,1,2018-01-01,    0]
[3,1,2018-01-02,    1]
[4,2,2018-01-04,    0]
[5,2,2018-01-05,    1]
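One possible vectorized way to get this column, offered as a minimal sketch rather than a definitive answer: rank the dates within each person using the "min" tie method and subtract one. A row's "min" rank is one more than the number of that person's rows with a strictly earlier date, so events sharing a date do not count each other. The column names event, person, date and event_count are assumptions taken from the sample, and the sketch assumes the event key is unique per row and that no dates are missing.

```python
import pandas as pd

# Sample data from the question (column names are assumed).
df = pd.DataFrame({
    "event": [1, 2, 3, 4, 5],
    "person": [1, 1, 1, 2, 2],
    "date": pd.to_datetime([
        "2018-01-03", "2018-01-01", "2018-01-02",
        "2018-01-04", "2018-01-05",
    ]),
})

# Rank dates within each person. With method="min", tied dates share the
# lowest rank, so (rank - 1) is the number of rows for that person whose
# date is strictly earlier than the current row's date.
df["event_count"] = (
    df.groupby("person")["date"]
      .rank(method="min")
      .sub(1)
      .astype(int)  # assumes no missing dates (NaT would produce NaN here)
)

print(df)
```

This avoids a row-by-row .apply and keeps the original row order, since the ranks are aligned back to the frame's index.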

Sorry for the formatting, I'm on my phone.

Example:

Say the fields are event, person, date; I would do:

SELECT t1.event,
       t1.person,
       t1.date,
       COUNT(DISTINCT CASE WHEN t2.date < t1.date
                           THEN t2.event END) AS event_count
FROM t1
-- join on person, not event: event is unique, so joining on it
-- would pair each row only with itself and always count 0
LEFT OUTER JOIN t1 AS t2 ON t2.person = t1.person
GROUP BY t1.event, t1.person, t1.date;
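The self-join idea can also be sketched directly in pandas without going through sqlite: merge the frame with itself on person, keep the pairs where the other event's date is strictly earlier, and count distinct events. This is only an illustration under assumptions: the column names event, person and date come from the sample, and the _prev suffix and event_count name are made up here. The merge produces every pair of rows per person, so memory use grows quadratically with the number of events per person.

```python
import pandas as pd

# Sample data from the question (column names are assumed).
df = pd.DataFrame({
    "event": [1, 2, 3, 4, 5],
    "person": [1, 1, 1, 2, 2],
    "date": pd.to_datetime([
        "2018-01-03", "2018-01-01", "2018-01-02",
        "2018-01-04", "2018-01-05",
    ]),
})

# Self-join: pair every row with every row of the same person.
# Columns from the right-hand copy get the (hypothetical) "_prev" suffix.
pairs = df.merge(
    df[["person", "event", "date"]],
    on="person",
    suffixes=("", "_prev"),
)

# Keep only pairs where the other event happened strictly earlier.
earlier = pairs[pairs["date_prev"] < pairs["date"]]

# Count distinct earlier events per original event.
counts = earlier.groupby("event")["event_prev"].nunique()

# Events with no earlier event are absent from counts, so fill with 0.
df["event_count"] = df["event"].map(counts).fillna(0).astype(int)

print(df)
```

On the sample data this gives the same counts as the desired output; for large groups a rank- or sort-based approach is likely cheaper than materializing all pairs.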

pandas countif with a condition

0 answers: