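It turns out to be hard to find search terms for this kind of problem. I need to write a script that builds all combinations of the strings in each row of a data frame. Each string should be used exactly once, and a string may only be combined with one that is two steps away from it. In effect the first and last columns are adjacent to each other, so they cannot be combined either (the strings really form a ring). The same script has to work on data frames with different even numbers of columns; 8 is used as the example here.
I have only managed to do this by hand for a data frame with a given number of columns, not with a single expression that works for any number of columns.
This is what the data looks like: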
Crop_1 Crop_2 Crop_3 Crop_4 Crop_5 Crop_6 Crop_7 Crop_8
1 Potato Onion Sugarbeet Grassclover Cabbage Potato Wheat Carrot
2 Potato Sugarbeet Grassclover Potato Cabbage Onion Carrot Wheat
In this case the desired result would be the following 6 options:
Pair_1 Pair_2 Pair_3 Pair_4 Crop_1 Crop_2 Crop_3 Crop_4 Crop_5 Crop_6 Crop_7 Crop_8
1 Potato-Sugarbeet Onion-Grassclover Cabbage-Wheat Potato-Carrot Potato Onion Sugarbeet Grassclover Cabbage Potato Wheat Carrot
2 Potato-Grassclover Sugarbeet-Potato Cabbage-Carrot Onion-Wheat Potato Sugarbeet Grassclover Potato Cabbage Onion Carrot Wheat
3 Potato-Wheat Onion-Carrot Sugarbeet-Cabbage Grassclover-Potato Potato Onion Sugarbeet Grassclover Cabbage Potato Wheat Carrot
4 Potato-Carrot Sugarbeet-Wheat Grassclover-Cabbage Potato-Onion Potato Sugarbeet Grassclover Potato Cabbage Onion Carrot Wheat
5 Potato-Cabbage Onion-Potato Sugarbeet-Wheat Grassclover-Carrot Potato Onion Sugarbeet Grassclover Cabbage Potato Wheat Carrot
6 Potato-Cabbage Sugarbeet-Onion Grassclover-Carrot Potato-Wheat Potato Sugarbeet Grassclover Potato Cabbage Onion Carrot Wheat
The data frame can be reproduced from this:
structure(list(Crop_1 = structure(c(1L, 1L), .Label = "Potato", class = "factor"),
Crop_2 = structure(1:2, .Label = c("Onion", "Sugarbeet"), class = "factor"),
Crop_3 = structure(2:1, .Label = c("Grassclover", "Sugarbeet"
), class = "factor"), Crop_4 = structure(1:2, .Label = c("Grassclover",
"Potato"), class = "factor"), Crop_5 = structure(c(1L, 1L
), .Label = "Cabbage", class = "factor"), Crop_6 = structure(2:1, .Label = c("Onion",
"Potato"), class = "factor"), Crop_7 = structure(2:1, .Label = c("Carrot",
"Wheat"), class = "factor"), Crop_8 = structure(1:2, .Label = c("Carrot",
"Wheat"), class = "factor")), class = "data.frame", row.names = c(NA,
-2L))
Answer 0 (score: 0)
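Here is a function that does the trick. The distinction you need to handle is between even column counts that are divisible by four and those that are not. For counts divisible by four, you can split the columns into groups of four and form two pairs in each group, as you have already done. We use seq.int to get the start of each pair and then setdiff to get the ends. For counts that are not divisible by four, treat the first 6 columns specially (matching 1-4, 2-5, 3-6) and then handle the remaining columns in groups of four as before.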
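The index scheme described above can be sketched on its own, without any data. `pair_indices` is a hypothetical helper name, not part of the answer, and the guard for exactly six columns is an assumption about the intended behaviour (without it, `seq.int(7, 6, 4)` raises an error).

```r
# Hypothetical helper: start and end column index of each pair
# for an even number of columns n.
pair_indices <- function(n) {
  stopifnot(n %% 2 == 0, n >= 4)
  if (n %% 4 == 0) {
    # Groups of four: the first two columns of each group start a pair.
    starts <- sort(c(seq.int(1, n, 4), seq.int(2, n, 4)))
  } else {
    # First six columns pair as 1-4, 2-5, 3-6; the rest go in groups of four.
    # Assumed guard: with exactly six columns there is no remainder.
    rest <- if (n > 6) c(seq.int(7, n, 4), seq.int(8, n, 4)) else integer(0)
    starts <- sort(c(1:3, rest))
  }
  # Every column that does not start a pair ends one.
  ends <- setdiff(seq_len(n), starts)
  data.frame(start = starts, end = ends)
}

pair_indices(8)   # pairs 1-3, 2-4, 5-7, 6-8
pair_indices(10)  # pairs 1-4, 2-5, 3-6, 7-9, 8-10
```

Because `starts` and `ends` are both in increasing order, the i-th start is matched with the i-th end.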
The remaining complexity is just making sure the function accepts a tibble and returns a tibble, because that is what nest and unnest expect.
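If the tibble round trip is not needed, the same pairing can be done row by row in base R. This is only a sketch of an alternative, not the answer's method; the `starts` and `ends` vectors are hard-coded for the 8-column example, and `result` is an illustrative name.

```r
# Example data from the question, as plain character columns.
tbl <- data.frame(
  Crop_1 = c("Potato", "Potato"),
  Crop_2 = c("Onion", "Sugarbeet"),
  Crop_3 = c("Sugarbeet", "Grassclover"),
  Crop_4 = c("Grassclover", "Potato"),
  Crop_5 = c("Cabbage", "Cabbage"),
  Crop_6 = c("Potato", "Onion"),
  Crop_7 = c("Wheat", "Carrot"),
  Crop_8 = c("Carrot", "Wheat"),
  stringsAsFactors = FALSE
)

# Pair indices for 8 columns (assumed here; see the description above).
starts <- c(1, 2, 5, 6)
ends   <- c(3, 4, 7, 8)

# apply() passes each row as a character vector; t() restores one row per input row.
pairs <- t(apply(tbl, 1, function(r) paste(r[starts], r[ends], sep = "-")))
colnames(pairs) <- paste0("Pair_", seq_along(starts))

result <- cbind(tbl, pairs)
result
```

This avoids nest and unnest entirely, at the cost of converting each row to a character vector.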
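library(tidyverse)

# Same data as in the question, with character columns instead of factors.
tbl <- structure(list(Crop_1 = c("Potato", "Potato"), Crop_2 = c("Onion", "Sugarbeet"), Crop_3 = c("Sugarbeet", "Grassclover"), Crop_4 = c("Grassclover", "Potato"), Crop_5 = c("Cabbage", "Cabbage"), Crop_6 = c("Potato", "Onion"), Crop_7 = c("Wheat", "Carrot"), Crop_8 = c("Carrot", "Wheat")), class = "data.frame", row.names = c(NA, -2L))

# Pair up the crops of a single row (a one-row data frame or tibble).
pair_crops <- function(crop_row) {
  # Convert column by column so that factor columns yield labels, not codes.
  crop_set <- vapply(crop_row, as.character, character(1), USE.NAMES = FALSE)
  n_crops <- length(crop_set)
  if (n_crops %% 2 == 1) {
    stop("Odd number of crops!")
  } else if (n_crops %% 4 == 0) {
    # Groups of four: the first two columns of each group start a pair.
    starts <- sort(c(seq.int(1, n_crops, 4), seq.int(2, n_crops, 4)))
  } else {
    # First six columns pair as 1-4, 2-5, 3-6; the rest go in groups of four.
    # The guard avoids a seq.int() error when there are exactly six columns.
    rest <- if (n_crops > 6) c(seq.int(7, n_crops, 4), seq.int(8, n_crops, 4)) else integer(0)
    starts <- sort(c(1:3, rest))
  }
  # Every column that does not start a pair ends one.
  ends <- setdiff(seq_len(n_crops), starts)
  tibble(
    pair = str_c(crop_set[starts], "-", crop_set[ends]),
    name = str_c("Pair_", seq_along(starts))
  ) %>%
    spread(name, pair)
}

# This uses the nest()/unnest() interface from before tidyr 1.0; with
# tidyr >= 1.0 use nest(crop = -rowid) and unnest(c(crop, pairs)) instead.
tbl %>%
  rowid_to_column() %>%
  nest(-rowid, .key = "crop") %>%
  mutate(pairs = map(crop, pair_crops)) %>%
  unnest()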
#> rowid Crop_1 Crop_2 Crop_3 Crop_4 Crop_5 Crop_6 Crop_7
#> 1 1 Potato Onion Sugarbeet Grassclover Cabbage Potato Wheat
#> 2 2 Potato Sugarbeet Grassclover Potato Cabbage Onion Carrot
#> Crop_8 Pair_1 Pair_2 Pair_3 Pair_4
#> 1 Carrot Potato-Sugarbeet Onion-Grassclover Cabbage-Wheat Potato-Carrot
#> 2 Wheat Potato-Grassclover Sugarbeet-Potato Cabbage-Carrot Onion-Wheat
Created on 2019-04-19 by the reprex package (v0.2.1)
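As a quick sanity check on the ring constraint from the question (first and last columns count as neighbours), the distance between two column positions can be measured around the ring. `ring_distance` is a hypothetical helper, and the index vectors below are written out by hand from the pairing scheme described in the answer.

```r
# Hypothetical helper: shortest distance between positions i and j
# on a ring of n columns.
ring_distance <- function(i, j, n) {
  d <- abs(i - j)
  pmin(d, n - d)
}

# Pairing for 8 columns: 1-3, 2-4, 5-7, 6-8.
d8 <- ring_distance(c(1, 2, 5, 6), c(3, 4, 7, 8), 8)

# Pairing for 10 columns: 1-4, 2-5, 3-6, 7-9, 8-10.
d10 <- ring_distance(c(1, 2, 3, 7, 8), c(4, 5, 6, 9, 10), 10)

# No pair joins neighbouring columns if every distance is at least 2.
all(d8 >= 2)
all(d10 >= 2)

# The first and last column are neighbours on the ring.
ring_distance(1, 8, 8)
```

The check only confirms that no pair joins adjacent columns; it does not enumerate the other valid pairings listed in the question.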