My dataset looks like this:
Site Sample Date
A A1 2016-09-01
A A1 2016-09-21
A A2 2016-09-15
A A2 2016-09-21
B B1 2016-09-03
B B2 2016-09-12
What I want is an expand.grid, but applied only within each level of df$Site:
Site Sample Date
A A1 2016-09-01
A A1 2016-09-15
A A1 2016-09-21
A A2 2016-09-01
A A2 2016-09-15
A A2 2016-09-21
B B1 2016-09-03
B B1 2016-09-12
B B2 2016-09-03
B B2 2016-09-12
But I can't work out how to tell expand.grid to do this, so that I don't end up with:
Site Sample Date
A A1 2016-09-01
A A1 2016-09-03
A A1 2016-09-12
A A1 2016-09-15
A A1 2016-09-21
A A2 2016-09-01
A A2 2016-09-03
A A2 2016-09-12
A A2 2016-09-15
A A2 2016-09-21
B B1 2016-09-01
B B1 2016-09-03
B B1 2016-09-12
B B1 2016-09-15
B B1 2016-09-21
B B2 2016-09-01
B B2 2016-09-03
B B2 2016-09-12
B B2 2016-09-15
B B2 2016-09-21
I hope this is clear; I couldn't figure out how to format these tables nicely!
Answer 0 (score: 1)
We can do this with 'dplyr/tidyr' after grouping by 'Site':
library(dplyr)
library(tidyr)
df1 %>%
group_by(Site) %>%
expand(Sample, Date)
# Site Sample Date
# <chr> <chr> <chr>
#1 A A1 2016-09-01
#2 A A1 2016-09-15
#3 A A1 2016-09-21
#4 A A2 2016-09-01
#5 A A2 2016-09-15
#6 A A2 2016-09-21
#7 B B1 2016-09-03
#8 B B1 2016-09-12
#9 B B2 2016-09-03
#10 B B2 2016-09-12
Or using data.table:
library(data.table)
setDT(df1)[, do.call(CJ, lapply(.SD, unique)) , by = Site]
# Site Sample Date
# 1: A A1 2016-09-01
# 2: A A1 2016-09-15
# 3: A A1 2016-09-21
# 4: A A2 2016-09-01
# 5: A A2 2016-09-15
# 6: A A2 2016-09-21
# 7: B B1 2016-09-03
# 8: B B1 2016-09-12
# 9: B B2 2016-09-03
#10: B B2 2016-09-12
Or we can use a base R solution:
do.call(rbind, lapply(split(df1[-1], df1$Site),
function(x) expand.grid(lapply(x, unique))))
# Sample Date
#A.1 A1 2016-09-01
#A.2 A2 2016-09-01
#A.3 A1 2016-09-21
#A.4 A2 2016-09-21
#A.5 A1 2016-09-15
#A.6 A2 2016-09-15
#B.1 B1 2016-09-03
#B.2 B2 2016-09-03
#B.3 B1 2016-09-12
#B.4 B2 2016-09-12
# Data used in the examples above
df1 <- structure(list(Site = c("A", "A", "A", "A", "B", "B"), Sample = c("A1",
"A1", "A2", "A2", "B1", "B2"), Date = c("2016-09-01", "2016-09-21",
"2016-09-15", "2016-09-21", "2016-09-03", "2016-09-12")), .Names = c("Site",
"Sample", "Date"), class = "data.frame", row.names = c(NA, -6L))
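Note that the base R output above has no Site column, because df1[-1] drops it before the split. A minimal sketch of one way to put it back, assuming the list names produced by split() are the site labels; the names `pieces` and `res` are only illustrative:

```r
# Same data as df1 above, stored as character columns
df1 <- data.frame(
  Site   = c("A", "A", "A", "A", "B", "B"),
  Sample = c("A1", "A1", "A2", "A2", "B1", "B2"),
  Date   = c("2016-09-01", "2016-09-21", "2016-09-15",
             "2016-09-21", "2016-09-03", "2016-09-12"),
  stringsAsFactors = FALSE
)

# Expand Sample x Date separately within each site
pieces <- lapply(split(df1[-1], df1$Site), function(x) {
  expand.grid(lapply(x, unique), stringsAsFactors = FALSE)
})

# Re-attach the site label (taken from the list names) to each piece
res <- do.call(rbind, Map(function(site, d) {
  data.frame(Site = site, d, stringsAsFactors = FALSE)
}, names(pieces), pieces))
rownames(res) <- NULL

res
```

The result should have the same Site, Sample and Date columns as the dplyr and data.table versions, though the row order within each site follows expand.grid rather than being sorted.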
Answer 1 (score: 0)
Here is a base R solution. You can pass the unique vectors to expand.grid, like this:
do.call(rbind, lapply(split(df, df$Site),
function(i) with(i, expand.grid(unique(Site), unique(Sample), unique(Date)))))
Var1 Var2 Var3
A.1 A A1 2016-09-01
A.2 A A2 2016-09-01
A.3 A A1 2016-09-21
A.4 A A2 2016-09-21
A.5 A A1 2016-09-15
A.6 A A2 2016-09-15
B.1 B B1 2016-09-03
B.2 B B2 2016-09-03
B.3 B B1 2016-09-12
B.4 B B2 2016-09-12
Or call unique on each expanded data.frame:
do.call(rbind, lapply(split(df, df$Site),
function(i) with(i, unique(expand.grid(Site, Sample, Date)))))
Var1 Var2 Var3
A.1 A A1 2016-09-01
A.9 A A2 2016-09-01
A.17 A A1 2016-09-21
A.25 A A2 2016-09-21
A.33 A A1 2016-09-15
A.41 A A2 2016-09-15
B.1 B B1 2016-09-03
B.3 B B2 2016-09-03
B.5 B B1 2016-09-12
B.7 B B2 2016-09-12
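In both outputs above the columns come out as Var1, Var2, Var3, and the rows are not in the order shown in the question. A minimal sketch, assuming df holds the question's data as character columns, that names the arguments to expand.grid and then sorts the result; `res` is an illustrative name:

```r
# The question's data, assumed to be stored as character columns
df <- data.frame(
  Site   = c("A", "A", "A", "A", "B", "B"),
  Sample = c("A1", "A1", "A2", "A2", "B1", "B2"),
  Date   = c("2016-09-01", "2016-09-21", "2016-09-15",
             "2016-09-21", "2016-09-03", "2016-09-12"),
  stringsAsFactors = FALSE
)

# Named arguments give named columns instead of Var1, Var2, Var3
res <- do.call(rbind, lapply(split(df, df$Site), function(i) {
  expand.grid(Site   = unique(i$Site),
              Sample = unique(i$Sample),
              Date   = unique(i$Date),
              stringsAsFactors = FALSE)
}))

# ISO-formatted date strings sort chronologically as plain text
res <- res[order(res$Site, res$Sample, res$Date), ]
rownames(res) <- NULL

res
```

Sorting on the character Date column only works here because the dates are in YYYY-MM-DD form; with another format, converting with as.Date first would be the safer choice.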