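First and last date on which a variable appears in my dataframe

Time: 2019-04-16 15:07:22

Tags: python pandas numpy date datetime

I want to get the first and last date on which each value of a variable appears in my dataframe: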


   datetime           A
2019-03-04 00:03      1
2019-03-04 00:04      1
2019-03-04 00:05      2
2019-03-04 00:06      2 
2019-03-04 00:07      1
2019-03-04 00:08      2
2019-03-04 00:09      3
2019-03-04 00:10      3
2019-03-04 00:11      3
2019-03-04 00:12      4
2019-03-04 00:13      3

Desired output:

A            First                     Last
1      2019-03-04 00:03          2019-03-04 00:07
2      2019-03-04 00:05          2019-03-04 00:08
3      2019-03-04 00:09          2019-03-04 00:13
4      2019-03-04 00:12          2019-03-04 00:12

I tried this:

data_df=pd.Series({x : y.datetime.tolist() for x , y in df.groupby('A')})
data_df=pd.DataFrame({'A':data_df.index, 'datetime':data_df.values})
data_df

I get this output:

A                              datetime
1       [2019-03-04 00:03,2019-03-04 00:04,2019-03-04 00:07]
2       [2019-03-04 00:05,2019-03-04 00:06,2019-03-04 00:08]
3       [2019-03-04 00:09,2019-03-04 00:10,2019-03-04 00:11,2019-03-04 00:13]
4       [2019-03-04 00:12]

1 answer:

Answer 0 (score: 1)

Use groupby and pass a list of functions to agg:

In[108]:
df.groupby('A').agg(['first','last'])

Out[108]: 
             datetime                    
                first                last
A                                        
1 2019-03-04 00:03:00 2019-03-04 00:07:00
2 2019-03-04 00:05:00 2019-03-04 00:08:00
3 2019-03-04 00:09:00 2019-03-04 00:13:00
4 2019-03-04 00:12:00 2019-03-04 00:12:00

If needed, you can call reset_index:

In[109]:
df.groupby('A').agg(['first','last']).reset_index()

Out[109]: 
   A            datetime                    
                   first                last
0  1 2019-03-04 00:03:00 2019-03-04 00:07:00
1  2 2019-03-04 00:05:00 2019-03-04 00:08:00
2  3 2019-03-04 00:09:00 2019-03-04 00:13:00
3  4 2019-03-04 00:12:00 2019-03-04 00:12:00

This calls first and last for each group.

Update: thanks to @Wen-Ben for the suggestion. If we select the single column before aggregating, i.e. df.groupby('A')['datetime'].agg(['first','last']), no MultiIndex is created on the columns, and the result matches your desired output.
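
The single-column variant mentioned in the update can be sketched roughly as follows. This is a minimal, self-contained example that rebuilds the sample data from the question; the variable name result and the renaming to First and Last are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2019-03-04 00:03", "2019-03-04 00:04", "2019-03-04 00:05",
        "2019-03-04 00:06", "2019-03-04 00:07", "2019-03-04 00:08",
        "2019-03-04 00:09", "2019-03-04 00:10", "2019-03-04 00:11",
        "2019-03-04 00:12", "2019-03-04 00:13",
    ]),
    "A": [1, 1, 2, 2, 1, 2, 3, 3, 3, 4, 3],
})

# Selecting the single 'datetime' column before agg keeps the
# resulting columns flat (no MultiIndex).
result = (
    df.groupby("A")["datetime"]
      .agg(["first", "last"])
      .reset_index()
      # Renaming is an assumption, made only to mirror the desired output.
      .rename(columns={"first": "First", "last": "Last"})
)

print(result)
```

Note that first and last follow the row order within each group, so this assumes the frame is already sorted by datetime. With unsorted data, aggregating with min and max instead would likely be the safer choice.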