Combine multiple date columns into one

Time: 2017-03-27 05:12:34

Tags: r dataframe

I have a data frame that contains several date columns:

col1<-seq( as.Date("2011-07-01"), by=20, len=10)
col2<-seq( as.Date("2011-09-01"), by=7, len=10)
col3<-seq( as.Date("2011-08-01"), by=1, len=10)
data.frame(col1,col2,col3)

The data frame looks like this:

         col1       col2       col3
1  2011-07-01 2011-09-01 2011-08-01
2  2011-07-21 2011-09-08 2011-08-02
3  2011-08-10 2011-09-15 2011-08-03
4  2011-08-30 2011-09-22 2011-08-04
5  2011-09-19 2011-09-29 2011-08-05
6  2011-10-09 2011-10-06 2011-08-06
7  2011-10-29 2011-10-13 2011-08-07
8  2011-11-18 2011-10-20 2011-08-08
9  2011-12-08 2011-10-27 2011-08-09
10 2011-12-28 2011-11-03 2011-08-10

I am trying to combine them into a single column so that:

a. Each row keeps only the lowest (earliest) date, and the other dates are ignored:

1  2011-07-01
2  2011-07-21
3  2011-08-03
4  2011-08-04
5  2011-08-05
6  2011-08-06
7  2011-08-07
8  2011-08-08
9  2011-08-09
10 2011-08-10

b. Each row keeps only the highest (latest) date:

1  2011-09-01
2  2011-09-08
3  2011-09-15
4  2011-09-22
5  2011-09-29
6  2011-10-09
7  2011-10-29
8  2011-11-18
9  2011-12-08
10 2011-12-28

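The real data set contains NAs, so an NA should be ignored when it is encountered, unless every column is missing for a given row, in which case the result for that row should be NA as well.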

Any ideas?

2 answers:

Answer 0 (score: 7)

pmin and pmax are useful here (dat is the data frame from the question):

do.call(pmin, dat)
# [1] "2011-07-01" "2011-07-21" "2011-08-03" "2011-08-04" "2011-08-05"
# [6] "2011-08-06" "2011-08-07" "2011-08-08" "2011-08-09" "2011-08-10"
do.call(pmax, dat)
# [1] "2011-09-01" "2011-09-08" "2011-09-15" "2011-09-22" "2011-09-29"
# [6] "2011-10-09" "2011-10-29" "2011-11-18" "2011-12-08" "2011-12-28"

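This also works with NA values if you pass na.rm=TRUE; a row in which every value is NA still returns NA. For example: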

do.call(pmin, c(dat, na.rm=TRUE) )

You can also restrict the comparison to specific columns, e.g.:

do.call(pmin, c(dat[c("col1","col2","col3")], na.rm=TRUE) )
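
As a rough check of the NA behaviour the question asks about, here is a minimal sketch on made-up data. The four-row dat below is hypothetical and is not the data from the question; it only assumes base R.

```r
# Hypothetical sample data (not from the question): three Date columns,
# with some missing values and one row (row 4) that is missing everywhere.
dat <- data.frame(
  col1 = as.Date(c("2011-07-01", NA,           "2011-08-10", NA)),
  col2 = as.Date(c("2011-09-01", "2011-09-08", NA,           NA)),
  col3 = as.Date(c("2011-08-01", "2011-08-02", "2011-08-03", NA))
)

# c(dat, na.rm = TRUE) builds a list of the columns plus the na.rm argument,
# so do.call() ends up calling pmin(col1, col2, col3, na.rm = TRUE).
earliest <- do.call(pmin, c(dat, na.rm = TRUE))
latest   <- do.call(pmax, c(dat, na.rm = TRUE))

# Rows 1-3 take the non-missing extreme; row 4 stays NA in both columns.
result <- data.frame(earliest, latest)
print(result)
```

Both results keep the Date class, so they can be assigned straight back as a new column of the data frame.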

Answer 1 (score: 1)

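We can use max.col to find the column index of the maximum value in each row, cbind it with the row index to extract that value for each row, and convert the result to a data.frame: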

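# df1 is the data frame from the question; convert its Date columns to a numeric matrix
j1 <- sapply(df1, as.numeric)
# latest date per row: pair each row index with the column index of the row maximum
df2 <- data.frame(Date = df1[cbind(seq_len(nrow(df1)), max.col(j1, "first"))])
# earliest date per row: negate the values so that max.col picks the row minimum
df3 <- data.frame(Date = df1[cbind(seq_len(nrow(df1)), max.col(-1*j1, "first"))])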
df2
#         Date
#1  2011-09-01
#2  2011-09-08
#3  2011-09-15
#4  2011-09-22
#5  2011-09-29
#6  2011-10-09
#7  2011-10-29
#8  2011-11-18
#9  2011-12-08
#10 2011-12-28

df3
#         Date
#1  2011-07-01
#2  2011-07-21
#3  2011-08-03
#4  2011-08-04
#5  2011-08-05
#6  2011-08-06
#7  2011-08-07
#8  2011-08-08
#9  2011-08-09
#10 2011-08-10
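
To make the indexing step easier to follow, here is a small sketch of the same mechanism on a plain numeric matrix. The matrix m and the variable names are made up for illustration and do not come from the answer.

```r
# Hypothetical 2 x 3 numeric matrix standing in for the numeric dates.
m <- matrix(c(3, 9, 5,
              8, 2, 7), nrow = 2, byrow = TRUE)

# Column index of the largest value in each row (ties go to the first column).
idx_max <- max.col(m, ties.method = "first")
# Negating the matrix turns the row minimum into the row maximum.
idx_min <- max.col(-m, ties.method = "first")

# A two-column (row, column) matrix picks exactly one cell per row.
row_max <- m[cbind(seq_len(nrow(m)), idx_max)]
row_min <- m[cbind(seq_len(nrow(m)), idx_min)]

print(row_max)  # 9 8
print(row_min)  # 3 2
```

Note that this sketch contains no NA values; how the max.col approach behaves on rows with missing dates is not covered by the answer.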

Another option is

as.Date(apply(df1, 1, min, na.rm = TRUE))
as.Date(apply(df1, 1, max, na.rm = TRUE))

Or with tidyverse:

library(tidyverse)
df1 %>%
      rowwise() %>%
      transmute(col1Max = max(col1, col2, col3), colMin = min(col1, col2, col3))