Reshaping with base::reshape drops rows with a warning

Time: 2015-09-04 15:43:04

Tags: r reshape

I have a data frame as follows:

parklotrate_df <- data.frame(
  spaceNum  = c(1, 1, 23, 23, 24, 24, 25, 25, 24, 24),
  type      = c("EMP", "EMP", "VIP", "VIP", "VIP", "VIP", "RESV", "RESV", "VIP", "VIP"),
  parkinout = c("IN", "OUT", "IN", "OUT", "IN", "OUT", "IN", "OUT", "IN", "OUT"),
  time      = c("2015/9/4 19:43", "2015/9/4 21:16", "2015/9/4 19:39", "2015/9/4 21:11",
                "2015/9/3 08:00", "2015/9/3 19:40", "2015/9/3 23:00", "2015/9/4 19:51",
                "2015/9/3 21:01", "2015/9/3 22:01"))

> parklotrate_df
   spaceNum type parkinout           time
1         1  EMP        IN 2015/9/4 19:43
2         1  EMP       OUT 2015/9/4 21:16
3        23  VIP        IN 2015/9/4 19:39
4        23  VIP       OUT 2015/9/4 21:11
5        24  VIP        IN 2015/9/3 08:00
6        24  VIP       OUT 2015/9/3 19:40
7        25 RESV        IN 2015/9/3 23:00
8        25 RESV       OUT 2015/9/4 19:51
9        24  VIP        IN 2015/9/3 21:01
10       24  VIP       OUT 2015/9/3 22:01

I used the following command to reshape the data frame to wide format, and I got some warning messages.

reshape(parklotrate_df, idvar=c("spaceNum","type"), timevar="parkinout", direction="wide")
  spaceNum type        time.IN       time.OUT
1        1  EMP 2015/9/4 19:43 2015/9/4 21:16
3       23  VIP 2015/9/4 19:39 2015/9/4 21:11
5       24  VIP 2015/9/3 08:00 2015/9/3 19:40
7       25 RESV 2015/9/3 23:00 2015/9/4 19:51
Warning messages:
1: In reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying,  :
  multiple rows match for parkinout=IN: first taken
2: In reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying,  :
  multiple rows match for parkinout=OUT: first taken
> 

The output is missing records 9 and 10, i.e. the second visit to VIP space 24, with IN time 2015/9/3 21:01 and OUT time 2015/9/3 22:01.
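The warning appears because the combination of spaceNum, type and parkinout does not identify a row uniquely: space 24 has two IN rows and two OUT rows, so reshape keeps only the first of each. As a rough sketch (base R only; the names `key` and `dup_rows` are introduced here just for illustration), the offending rows can be listed like this:

```r
parklotrate_df <- data.frame(
  spaceNum  = c(1, 1, 23, 23, 24, 24, 25, 25, 24, 24),
  type      = c("EMP", "EMP", "VIP", "VIP", "VIP", "VIP", "RESV", "RESV", "VIP", "VIP"),
  parkinout = rep(c("IN", "OUT"), times = 5),
  time      = c("2015/9/4 19:43", "2015/9/4 21:16", "2015/9/4 19:39", "2015/9/4 21:11",
                "2015/9/3 08:00", "2015/9/3 19:40", "2015/9/3 23:00", "2015/9/4 19:51",
                "2015/9/3 21:01", "2015/9/3 22:01"),
  stringsAsFactors = FALSE
)

# The columns that reshape() uses to decide which wide row and column a value goes to
key <- parklotrate_df[c("spaceNum", "type", "parkinout")]

# Rows whose key has already appeared further up; these are the ones reshape() drops
dup_rows <- parklotrate_df[duplicated(key), ]
dup_rows
```

Any row listed in `dup_rows` needs an extra identifier (for example a per-visit counter) before the reshape can keep it.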

I would like to get:

  spaceNum type        time.IN       time.OUT
1        1  EMP 2015/9/4 19:43 2015/9/4 21:16
3       23  VIP 2015/9/4 19:39 2015/9/4 21:11
5       24  VIP 2015/9/3 08:00 2015/9/3 19:40
7       25 RESV 2015/9/3 23:00 2015/9/4 19:51
9       24  VIP 2015/9/3 21:01 2015/9/3 22:01

What command can I use to get this result?

1 answer:

Answer 0: (score: 1)

OK, taking into account the comments and the OP's edit to the question, this should work.

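# Sort by space and time so that each "IN" row is followed by its matching "OUT"
df <- with(parklotrate_df, parklotrate_df[order(spaceNum, time), ])
df <- df[-4, ]   # remove one of the "OUT" rows for the sake of the demo...
# Start a new visit id at every "IN" row
df$id <- cumsum(df$parkinout == "IN")
# Reshape per visit, then drop the helper id column (column 3)
reshape(df, idvar = c("id", "spaceNum", "type"), timevar = "parkinout", direction = "wide")[, -3]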
#   spaceNum type        time.IN       time.OUT
# 1        1  EMP 2015/9/4 19:43 2015/9/4 21:16
# 3       23  VIP 2015/9/4 19:39           <NA>
# 5       24  VIP 2015/9/3 08:00 2015/9/3 19:40
# 9       24  VIP 2015/9/3 21:01 2015/9/3 22:01
# 7       25 RESV 2015/9/3 23:00 2015/9/4 19:51
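For reference, here is a sketch of the same idea applied to the full data, without deleting a row for the demo. It makes two assumptions that go beyond the answer above: the timestamps are parsed with as.POSIXct (format "%Y/%m/%d %H:%M", time zone "UTC") so that the ordering is chronological rather than alphabetical, and every visit starts with an "IN" row. The names `parsed` and `wide` are illustrative only.

```r
parklotrate_df <- data.frame(
  spaceNum  = c(1, 1, 23, 23, 24, 24, 25, 25, 24, 24),
  type      = c("EMP", "EMP", "VIP", "VIP", "VIP", "VIP", "RESV", "RESV", "VIP", "VIP"),
  parkinout = rep(c("IN", "OUT"), times = 5),
  time      = c("2015/9/4 19:43", "2015/9/4 21:16", "2015/9/4 19:39", "2015/9/4 21:11",
                "2015/9/3 08:00", "2015/9/3 19:40", "2015/9/3 23:00", "2015/9/4 19:51",
                "2015/9/3 21:01", "2015/9/3 22:01"),
  stringsAsFactors = FALSE
)

# Parse the timestamps so that sorting is chronological (assumed format and time zone)
parsed <- as.POSIXct(as.character(parklotrate_df$time),
                     format = "%Y/%m/%d %H:%M", tz = "UTC")

# Within each space, put the rows in time order
df <- parklotrate_df[order(parklotrate_df$spaceNum, parsed), ]

# Every "IN" row opens a new visit; the "OUT" row that follows shares its id
df$id <- cumsum(df$parkinout == "IN")

# One wide row per visit, then drop the helper column by name
wide <- reshape(df, idvar = c("id", "spaceNum", "type"),
                timevar = "parkinout", direction = "wide")
wide$id <- NULL
wide
```

Dropping the helper column by name rather than by position keeps the code working if the column order changes. If the data can contain an "OUT" row with no preceding "IN", the cumsum counter would attach it to the previous visit, so such rows would need to be handled separately.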