Is it possible to "shift up" the observations after pivot_wider() and remove the NA's above the observations in each column? I tried lag()-ing the columns, but that seems cumbersome. Obviously, I am not bound to this approach, but I would like to stay within the tidyverse.
My current code is as follows:
library(tidyverse)
set.seed(1111)
df <- data.frame(
  item = as.numeric(sample(1:20)),
  clust = as.numeric(sample(1:3, 20, replace = TRUE))
)
df %>%
  arrange(clust, item) %>%
  rowid_to_column() %>%
  pivot_wider(names_from = clust, values_from = item, names_prefix = "Cluster_") %>%
  select(-rowid)
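This produces the following output, in which each cluster's values sit below a block of NA's; the desired output would have them shifted to the top of each column: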
# A tibble: 20 x 3
   Cluster_1 Cluster_2 Cluster_3
       <dbl>     <dbl>     <dbl>
 1         3        NA        NA
 2        13        NA        NA
 3        14        NA        NA
 4        15        NA        NA
 5        16        NA        NA
 6        17        NA        NA
 7        19        NA        NA
 8        20        NA        NA
 9        NA         1        NA
10        NA         4        NA
11        NA         6        NA
12        NA         7        NA
13        NA         8        NA
14        NA         9        NA
15        NA        12        NA
16        NA        18        NA
17        NA        NA         2
18        NA        NA         5
19        NA        NA        10
20        NA        NA        11
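I know that this approach compromises the dataset, but it is purely for aesthetic reasons: the tibble is subsequently exported to a LaTeX document and only serves to visualize the cluster grouping.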
Answer 0 (score: 1)
You can achieve the desired output like this:
library(tidyverse)
set.seed(1111)
df <- data.frame(
  item = as.numeric(sample(1:20)),
  clust = as.numeric(sample(1:3, 20, replace = TRUE))
)
df %>%
  arrange(clust, item) %>%
  group_by(clust) %>%
  mutate(id = row_number()) %>%
  pivot_wider(names_from = clust, values_from = item, names_prefix = "Cluster_") %>%
  select(-id)
#> # A tibble: 8 x 3
#>   Cluster_1 Cluster_2 Cluster_3
#>       <dbl>     <dbl>     <dbl>
#> 1         3         1         2
#> 2        13         4         5
#> 3        14         6        10
#> 4        15         7        11
#> 5        16         8        NA
#> 6        17         9        NA
#> 7        19        12        NA
#> 8        20        18        NA
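To see why the per-cluster index matters, it can help to run the same steps on a small hand-made data frame. The sketch below is only an illustration: `toy`, `long`, and `wide` are hypothetical names that do not appear in the question, and the values are made up so the result does not depend on the random seed. Because `id` restarts at 1 inside every cluster, rows from different clusters share the same `id`, and pivot_wider() places them on the same output row instead of giving each observation its own row.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for df
toy <- data.frame(
  item  = c(5, 2, 9, 1, 7),
  clust = c(1, 2, 1, 2, 1)
)

long <- toy %>%
  arrange(clust, item) %>%
  group_by(clust) %>%
  mutate(id = row_number()) %>%  # index restarts at 1 in every cluster
  ungroup()

# id is the only remaining identifier column, so rows with equal id are merged
wide <- long %>%
  pivot_wider(names_from = clust, values_from = item, names_prefix = "Cluster_") %>%
  select(-id)

wide
```

With rowid_to_column(), by contrast, every observation gets a unique index, which is why the question's version keeps 20 rows padded with NA's.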
Answer 1 (score: 0)
Here is a base R approach that uses split() and pads each vector to a common length.
s <- split(df$item, df$clust)
as.data.frame(lapply(s, function(x) `length<-`(sort(x), max(lengths(s)))))
#   X1 X2 X3
# 1  3  1  2
# 2 13  4  5
# 3 14  6 10
# 4 15  7 11
# 5 16  8 NA
# 6 17  9 NA
# 7 19 12 NA
# 8 20 18 NA
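The padding step relies on the replacement function `length<-`, which extends a vector with NA's when the new length is larger than the current one. The following is a minimal sketch on made-up data; `toy_items`, `toy_clust`, `padded`, and `out` are hypothetical names, and the setNames() call is an optional addition (not part of the answer above) for anyone who prefers the Cluster_ prefix over the default X1, X2, ... column names.

```r
# Hypothetical toy data, independent of the random seed
toy_items <- c(5, 2, 9, 1, 7)
toy_clust <- c(1, 2, 1, 2, 1)

# One vector of items per cluster, named by cluster id
s <- split(toy_items, toy_clust)
n <- max(lengths(s))

# Sort each vector, then extend it to length n; missing slots become NA
padded <- lapply(s, function(x) `length<-`(sort(x), n))

# Optional: replace the default X1, X2 names with a Cluster_ prefix
out <- setNames(as.data.frame(padded), paste0("Cluster_", names(s)))
out
```

Calling `length<-`(x, n) in function form returns the padded vector directly, which is what makes it usable inside lapply().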