I have a data frame like this:
date X1 X2
1: 2001-12-31 96.32 NA
2: 2002-01-29 NA 100.7
3: 2002-01-31 96.59 NA
4: 2002-02-28 96.67 100.7
5: 2002-03-29 NA 100.7
6: 2002-03-31 97.36 NA
7: 2002-04-29 NA 87.3
8: 2002-04-30 97.72 NA
9: 2002-05-29 NA 87.3
10: 2002-05-31 97.60 NA
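Some of my values fall on different dates, and I want to align them to month-end. I'd like to use X1 as the "base" and move each X2 value to the month-end date shown in X1. The end product should be a clean data frame with matching dates and no NAs (apart from months where a series has no observation at all).
Expected output: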
date X1 X2
1: 2001-12-31 96.32 NA
2: 2002-01-31 96.59 100.7
3: 2002-02-28 96.67 100.7
4: 2002-03-31 97.36 100.7
5: 2002-04-30 97.72 87.3
6: 2002-05-31 97.60 87.3
df <- structure(list(date = structure(c(11687L, 11716L, 11718L, 11746L,
11775L, 11777L, 11806L, 11807L, 11836L, 11838L), class = "Date"),
X1 = c(96.32, NA, 96.59, 96.67, NA, 97.36, NA, 97.72, NA,
97.6), X2 = c(NA, 100.7, NA, 100.7, 100.7, NA, 87.3, NA,
87.3, NA)), .Names = c("date", "X1", "X2"), row.names = c(NA,
10L), class = "data.frame")
Answer 0: (score: 2)
We can use data.table.
Try the following:
library(data.table)
setDT(df)[,month := month(date)][,lapply(.SD, max, na.rm = TRUE), by = month]
# month date X1 X2
#1: 12 2001-12-31 96.32 -Inf
#2: 1 2002-01-31 96.59 100.7
#3: 2 2002-02-28 96.67 100.7
#4: 3 2002-03-31 97.36 100.7
#5: 4 2002-04-30 97.72 87.3
#6: 5 2002-05-31 97.60 87.3
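A new variable, month, is created for grouping purposes (the original date column is kept); you can always drop it afterwards if you don't need it. Because max is applied to every column, date becomes the latest date in each month, which here is the month-end date from X1. Note that max(..., na.rm = TRUE) returns -Inf (with a warning) for a group with no non-missing values, which is why X2 shows -Inf rather than NA for December 2001.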
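As a possible variation on the approach above, here is a minimal sketch that tries to address two things. Grouping by month alone would merge the same month from different years if the data spanned more than one year, and max() yields -Inf for an all-NA group. The sketch groups by year and month and takes the last non-missing value in each group instead of the maximum. This is a swapped-in technique, not the answerer's code. last_obs, yr, and mon are hypothetical names introduced only for this illustration.

```r
library(data.table)

# Rebuild the sample data from the question
df <- data.frame(
  date = as.Date(c("2001-12-31", "2002-01-29", "2002-01-31", "2002-02-28",
                   "2002-03-29", "2002-03-31", "2002-04-29", "2002-04-30",
                   "2002-05-29", "2002-05-31")),
  X1 = c(96.32, NA, 96.59, 96.67, NA, 97.36, NA, 97.72, NA, 97.60),
  X2 = c(NA, 100.7, NA, 100.7, 100.7, NA, 87.3, NA, 87.3, NA)
)

# Hypothetical helper: last non-missing value of x, or NA if there is none
last_obs <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0L) return(NA_real_)
  x[length(x)]
}

dt <- as.data.table(df)
setorder(dt, date)  # make sure "last" means "latest date" within each month

# One row per calendar month: latest date plus last observed value per series
res <- dt[, .(date = max(date),
              X1   = last_obs(X1),
              X2   = last_obs(X2)),
          by = .(yr = year(date), mon = month(date))]

# Drop the grouping helpers once they are no longer needed
res[, c("yr", "mon") := NULL]
print(res)
```

On the sample data this should give the six rows from the expected output, with NA instead of -Inf for X2 in December 2001. It assumes the latest date in each month is the month-end date you want to keep, which holds here because X1 is always observed on the last day of the month.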