Here is my problem. I have these 3 data frames:
single
Mat Price
A 1029.90
B 568.52
C 497.12
D 573.50
E 217.92
double
Mat1 Mat2 Price
A C 1529.61
A D 1623.49
A E 1325.86
B C 1050.64
B D 1146.65
B E 849.02
C D 999.42
C E 700.03
D E 776.41
triple
Mat1 Mat2 Mat3 Price
B C D 1564.98
B C E 1267.30
C D E 1350.45
B D E 1202.33
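Using these 3 data frames, I have to build another data frame (or list) that gives all the possible combinations containing each of the letters A to E once and only once.
For example, some possible combinations are: A | B | C | D | E, all taken from data frame single; or AC taken from data frame double and B | D | E taken from data frame single; or BCD taken from data frame triple and AE taken from data frame double; and so on for all combinations.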
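The "once and only once" rule can be sketched as a small base R check. The helper name `uses_each_letter_once` is hypothetical and not part of the question; it only tests letter coverage and does not verify that each group actually exists in single, double, or triple.

```r
# Hypothetical helper: TRUE if the groups together use each of A..E exactly once.
uses_each_letter_once <- function(groups,
                                  letters_needed = c("A", "B", "C", "D", "E")) {
  used <- unlist(strsplit(groups, ""))  # split every group into single letters
  # Same count and same set of letters means no letter is missing or repeated.
  length(used) == length(letters_needed) && setequal(used, letters_needed)
}

uses_each_letter_once(c("AC", "B", "D", "E"))  # TRUE
uses_each_letter_once(c("AE", "BCD"))          # TRUE
uses_each_letter_once(c("AC", "CD", "E"))      # FALSE: C twice, B missing
```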
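In addition, for each combination found, I would like a column of the data frame dedicated to the price, calculated as:
(sum of the prices of the materials) + 500 * (number of groupings used in the combination).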
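Taking the previous examples: the combination A | B | C | D | E, with everything taken from data frame single, has a price equal to
(1029.90 + 568.52 + 497.12 + 573.50 + 217.92) + 500 * (5)
The combination AC | B | D | E has a price equal to (1529.61 + 568.52 + 573.50 + 217.92) + 500 * (4)
The combination AE | BCD has a price equal to (1325.86 + 1564.98) + 500 * (2)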
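For reference, the three worked examples can be reproduced with a minimal base R sketch. The lookup vector `price` and the helper `combo_price` are illustrative names, not part of the question; the prices are copied from the data frames above, and only the groups used in the examples are included.

```r
# Price lookup keyed by the concatenated letters of each group
# (values copied from the question's data; only the groups used below).
price <- c(A = 1029.90, B = 568.52, C = 497.12, D = 573.50, E = 217.92,
           AC = 1529.61, AE = 1325.86, BCD = 1564.98)

# Hypothetical helper: sum of the group prices plus a fee of 500 per group.
combo_price <- function(groups, price, fee = 500) {
  sum(price[groups]) + fee * length(groups)
}

combo_price(c("A", "B", "C", "D", "E"), price)  # 5386.96
combo_price(c("AC", "B", "D", "E"), price)      # 4889.55
combo_price(c("AE", "BCD"), price)              # 3890.84
```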
Is it possible to create a function in R that performs this process automatically? Thank you very much.
Answer 0 (score: 0)
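Here is my answer, which is probably over-complicated. Not knowing how much your real data frames differ from these, I wrote it to handle any other combinations that might appear in double and triple, but there may be a simpler approach that I haven't thought of. I think the hard part is using crossing and then several different filter and distinct calls to whittle the result down to the 29 combinations you want. After that, producing the right numbers only takes gather and summarise. I left_join back onto the original so that you can see what went into each result. If you know that the 29 combinations will always be the same, hard-coding them would be simpler.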
library(tidyverse)
single <- read_table2("Mat Price
A 1029.90
B 568.52
C 497.12
D 573.50
E 217.92")
double <- read_table2("Mat1 Mat2 Price
A C 1529.61
A D 1623.49
A E 1325.86
B C 1050.64
B D 1146.65
B E 849.02
C D 999.42
C E 700.03
D E 776.41")
triple <- read_table2("Mat1 Mat2 Mat3 Price
B C D 1564.98
B C E 1267.30
C D E 1350.45
B D E 1202.33")
# Create vectors of letter groups
ones <- single$Mat
twos <- str_c(double$Mat1, double$Mat2)
threes <- str_c(triple$Mat1, triple$Mat2, triple$Mat3)
# Create vector of permutations of ABCDE
options <- ones %>%
  gtools::permutations(5, 5, .) %>%
  as_tibble() %>%
  unite("option", V1:V5, sep = "") %>%
  `[[`(1)
# Create table with prices to join combinations onto
lookup <- bind_rows(
  single %>% mutate(rowcomb = Mat),
  double %>% unite(rowcomb, Mat1:Mat2, remove = FALSE, sep = ""),
  triple %>% unite(rowcomb, Mat1:Mat3, remove = FALSE, sep = "")
) %>%
  select(rowcomb, price = Price)
# Get all possible combinations of letter groups that could make five letters
combinations <- bind_rows(
  crossing(ones, ones, ones, ones, ones),
  crossing(ones, ones, ones, twos),
  crossing(ones, ones, threes),
  crossing(ones, twos, twos),
  crossing(twos, threes)
) %>%
  mutate_all(~ replace_na(., "")) %>%
  unite("string", ones:twos1, sep = "", remove = FALSE) %>%
  filter(string %in% options) %>% # Remove any that have repeated letters
  # Add column with the number of elements in each combination
  bind_cols(groupings = pmap_int(.[, -1], function(...) sum(c(...) != ""))) %>%
  bind_cols( # Add column that lets us remove different permutations of the same element combinations
    row = .[, 2:9] %>%
      pmap(function(...) c(...)) %>%
      map_chr(~str_c(str_sort(.), collapse = "_"))
  ) %>%
  distinct(row, .keep_all = TRUE) %>%
  rowid_to_column(var = "comb_id") # 29 final combinations
total_price <- combinations %>%
  gather("table", "letter", ones:twos1) %>%
  left_join(lookup, by = c("letter" = "rowcomb")) %>%
  group_by(comb_id) %>%
  summarise(total_price = sum(price, na.rm = TRUE))
output <- left_join(combinations, total_price, by = "comb_id") %>%
  mutate(end_price = total_price + 500 * groupings) %>%
  select(comb_id, row, groupings, end_price)
output
#> # A tibble: 29 x 4
#>    comb_id row          groupings end_price
#>      <int> <chr>            <int>     <dbl>
#>  1       1 ___A_B_C_D_E         5     5387.
#>  2       2 ____A_B_C_DE         4     4872.
#>  3       3 ____A_B_CE_D         4     4872.
#>  4       4 ____A_B_CD_E         4     4816.
#>  5       5 ____A_BE_C_D         4     4950.
#>  6       6 ____A_BD_C_E         4     4892.
#>  7       7 ____A_BC_D_E         4     4872.
#>  8       8 ____AE_B_C_D         4     4965.
#>  9       9 ____AD_B_C_E         4     4907.
#> 10      10 ____AC_B_D_E         4     4890.
#> # ... with 19 more rows
Created on 2018-04-09 by the reprex package (v0.2.0).
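As a cross-check on the count of 29, here is a base R sketch of a different technique: a recursive enumeration rather than the crossing-and-filter approach above. It is not the method used in this answer. The names `price`, `groups`, and `enumerate_covers` are hypothetical; the prices are copied from the question's three data frames. At each step it only tries groups that contain the first letter still uncovered, so each combination is produced once and no de-duplication of permutations is needed.

```r
# Prices keyed by the concatenated letters of each group (from the question).
price <- c(
  A = 1029.90, B = 568.52, C = 497.12, D = 573.50, E = 217.92,
  AC = 1529.61, AD = 1623.49, AE = 1325.86, BC = 1050.64, BD = 1146.65,
  BE = 849.02, CD = 999.42, CE = 700.03, DE = 776.41,
  BCD = 1564.98, BCE = 1267.30, CDE = 1350.45, BDE = 1202.33
)
groups <- strsplit(names(price), "")  # each group as a vector of letters
names(groups) <- names(price)

# Hypothetical helper: list every way to cover `remaining` with disjoint groups.
enumerate_covers <- function(remaining, groups) {
  if (length(remaining) == 0) return(list(character(0)))
  first <- remaining[1]
  out <- list()
  for (g in names(groups)) {
    members <- groups[[g]]
    # The group must contain the first uncovered letter and fit in what is left.
    if (first %in% members && all(members %in% remaining)) {
      rest <- enumerate_covers(setdiff(remaining, members), groups)
      out <- c(out, lapply(rest, function(r) c(g, r)))
    }
  }
  out
}

covers <- enumerate_covers(c("A", "B", "C", "D", "E"), groups)

result <- data.frame(
  combination = vapply(covers, paste, character(1), collapse = " | "),
  groupings   = lengths(covers),
  end_price   = vapply(covers, function(g) sum(price[g]) + 500 * length(g),
                       numeric(1))
)
nrow(result)  # 29
```

Because the groups are read from the names of `price`, adding or removing rows in double or triple only requires changing that vector.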