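Using Python 3.5 and Pandas 0.19.2.
Here is my problem: I have a dataframe containing several "IdActivo" values, sorted in ascending order by date and time. It also has a field called Resultado whose value is either NaN or 1. For each row, I need to calculate how long ago the Resultado field was 1 for that row's "IdActivo", for each of the last N times that happened.
This is my dataframe: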
import pandas as pd
import numpy as np
from datetime import datetime
df = pd.DataFrame({'IdActivo': [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2],
                   'Fecha': ['1990-01-02','1990-01-03','1990-01-04','1990-01-05','1990-01-08',
                             '1990-01-09','1990-01-10','1990-01-11','1990-01-12','1990-01-15',
                             '1990-01-16','1990-01-17','1990-01-18','1990-01-19','1990-01-22',
                             '1990-01-23','1990-01-24','1990-01-25','1990-01-26','1990-01-29'],
                   'Hora': ['10:10:00','10:11:00','10:12:00','10:13:00','10:10:00',
                            '10:10:00','10:17:00','10:14:00','11:14:00','12:14:00',
                            '10:10:00','10:20:00','14:22:00','15:22:00','16:22:00',
                            '10:10:00','00:00:00','00:00:00','00:00:00','00:00:00']})
def Inicio():
    numHoraDia = '10:10:00'
    numDia = 2  # for us 2 means Tuesday, since we add 1 to Monday, which is 0 by default
    nomDiasSemanaHora = " Resultado"
    inpfield = "Fecha"
    oupfield = "Dia_Semana"
    df_final = Fecha_Dia_Hora(df, inpfield, oupfield, numHoraDia, numDia, nomDiasSemanaHora)
    print(df_final)

def Fecha_Dia_Hora(df, inpfield, oupfield, numHoraDia, numDia, nomDiasSemanaHora):
    ord_df = df.sort_values(by=['IdActivo', 'Fecha'])
    ord_df[inpfield] = pd.to_datetime(ord_df[inpfield])
    # dt.dayofweek is 0 for Monday, so adding 1 makes Tuesday equal to 2
    ord_df[oupfield] = ord_df[inpfield].dt.dayofweek + 1
    ord_df[nomDiasSemanaHora] = np.nan
    # Flag the rows that fall on the chosen weekday at the chosen time
    # (.loc replaces the deprecated .ix; numeric 1 keeps the column numeric)
    mask = (ord_df[oupfield] == numDia) & (ord_df.Hora == numHoraDia)
    ord_df.loc[mask, nomDiasSemanaHora] = 1
    return ord_df.sort_index()

def Fin():
    print("FIN")

if __name__ == "__main__":
    Inicio()
    Fin()
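To make the requirement concrete, here is a minimal sketch of one possible reading of it, not a confirmed solution. It makes several assumptions: Fecha and Hora have already been combined into a single datetime column (called Timestamp here), the index is unique, the flag column holds NaN or a non-null marker, and a row that is itself flagged counts as its own most recent hit (elapsed time zero). The function name time_since_last_hits, the Timestamp column, and the Desde_k output columns are hypothetical names chosen for illustration. The demo frame is a six-row subset of the data above.

```python
import numpy as np
import pandas as pd


def time_since_last_hits(df, n=2, key="IdActivo", ts="Timestamp", flag="Resultado"):
    """Add columns Desde_1..Desde_n holding the elapsed time since the k-th
    most recent flagged row of the same `key` (current row included)."""
    out = df.sort_values([key, ts]).copy()
    # Rows where the flag is set (anything that is not NaN).
    hits = out.loc[out[flag].notnull(), [key, ts]]
    kth = hits[ts]  # timestamp of the most recent hit, known only on hit rows
    for k in range(1, n + 1):
        # Carry the k-th most recent hit forward to later rows of the same group.
        filled = kth.reindex(out.index).groupby(out[key]).ffill()
        out["Desde_{}".format(k)] = out[ts] - filled
        # Step back one more hit within each group for the next iteration.
        kth = kth.groupby(hits[key]).shift(1)
    return out.sort_index()


# Hypothetical demo frame: a subset of the question's rows, with Fecha and
# Hora merged into one datetime column and the Tuesday 10:10:00 rows flagged.
demo = pd.DataFrame({
    "IdActivo": [1, 1, 1, 1, 2, 2],
    "Timestamp": pd.to_datetime([
        "1990-01-02 10:10:00", "1990-01-03 10:11:00",
        "1990-01-09 10:10:00", "1990-01-10 10:17:00",
        "1990-01-16 10:10:00", "1990-01-17 10:20:00",
    ]),
    "Resultado": [1, np.nan, 1, np.nan, 1, np.nan],
})

result = time_since_last_hits(demo, n=2)
print(result)
```

Rows with fewer than k earlier hits get NaT in Desde_k. If the current row should not count as its own hit, the forward-filled series would need one extra per-group shift before the subtraction.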
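Here is an example of the derived dataframe, built from the one you can see in the code:
Which functions should I look into to achieve this?
Thanks
Angel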