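I am analyzing my company's demand for raw materials. My approach is to combine the sales records of finished goods with the bill of materials for each finished good. The problem is that each finished good contains several components, and many finished goods share common components. I want to keep every individual sales record for each finished good and multiply the units sold (FG_UnitsSold) by the per-unit quantity of each component to get the raw material demand. Here is the code for a sample data set: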
library(dplyr)
# Sales records: one row per finished-good order
fg_Sales <- tibble(FG_PartNumber = rep(c("A","B","C"), 2),
                   Order_Date = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
                   FG_UnitsSold = c(100,200,300,400,500,600))
# Bill of materials: one row per (finished good, component) pair
bill_materials <- tibble(FG_PartNumber = rep(c("A","B","C"), 4),
                         Components = c("C1","C2","C3","C4","C5","C6","C7","C7","C7","C8","C8","C9"),
                         Qty = rnorm(n = 12, mean = 3, sd = 1)) %>%
  arrange(FG_PartNumber)
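I am familiar with left_join from dplyr, but it does not seem to work here, because it always gives me only the first component of each finished good.
Can anyone help? Thanks.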
Answer 0 (score: 0)
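Maybe I am misunderstanding the question, but if you group both data frames by FG_PartNumber and build a pivot-style summary of the quantities you care about, you can get the totals you want: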
library(dplyr)
# Create data (seed fixed so the random Qty values are reproducible)
set.seed(1)
fg_Sales <- tibble(FG_PartNumber = rep(c("A","B","C"), 2),
                   Order_Date = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
                   FG_UnitsSold = c(100,200,300,400,500,600))
bill_materials <- tibble(FG_PartNumber = rep(c("A","B","C"), 4),
                         Components = c("C1","C2","C3","C4","C5","C6","C7","C7","C7","C8","C8","C9"),
                         Qty = rnorm(n = 12, mean = 3, sd = 1)) %>%
  arrange(FG_PartNumber)
#make pivot tables for sales and quantity
tot_sales <- fg_Sales %>%
group_by(FG_PartNumber) %>%
summarise(tot_sales = sum(FG_UnitsSold))
tot_materials <- bill_materials %>%
group_by(FG_PartNumber) %>%
summarise(tot_qty = sum(Qty))
# join the pivot tables together on the finished-good part number
df <- left_join(tot_sales, tot_materials, by = "FG_PartNumber")
> df
# A tibble: 3 × 3
  FG_PartNumber tot_sales  tot_qty
  <chr>             <dbl>    <dbl>
1 A                   500 13.15087
2 B                   700 14.76326
3 C                   900 11.30953
Answer 1 (score: 0)
I think inner_join from dplyr is the best option:
library(dplyr)
fg_Sales_ext <- inner_join(x = fg_Sales,
                           y = bill_materials,
                           by = "FG_PartNumber")
From the dplyr documentation: "If there are multiple matches between x and y, all combinations of the matches are returned."
Using the joined table fg_Sales_ext, you can now run whatever analysis you need.
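As a rough sketch of what that follow-up analysis could look like for the raw material demand in the question: after the join, each sales record is repeated once per component, so multiplying FG_UnitsSold by Qty gives the demand per row, which can then be summed per component. The Qty values below are fixed, made-up numbers (an assumption, replacing the random ones in the question) so the result is reproducible, and the names Component_Demand, component_demand and Total_Demand are hypothetical.

```r
library(dplyr)

# Sample data from the question. Qty is a fixed, hypothetical vector here
# (the question uses random values) so the totals can be checked by hand.
fg_Sales <- tibble(
  FG_PartNumber = rep(c("A", "B", "C"), 2),
  Order_Date    = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
  FG_UnitsSold  = c(100, 200, 300, 400, 500, 600)
)
bill_materials <- tibble(
  FG_PartNumber = rep(c("A", "B", "C"), 4),
  Components    = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C7", "C7", "C8", "C8", "C9"),
  Qty           = rep(c(1, 2, 3), 4)  # assumption: A uses 1, B uses 2, C uses 3 of each component
) %>%
  arrange(FG_PartNumber)

# Every sales record is matched with every component of its finished good.
# Recent dplyr versions may warn about a many-to-many relationship here;
# that is the intended behaviour for this join.
fg_Sales_ext <- inner_join(fg_Sales, bill_materials, by = "FG_PartNumber") %>%
  mutate(Component_Demand = FG_UnitsSold * Qty)  # demand per sales record and component

# Total demand per component across all finished goods that share it
component_demand <- fg_Sales_ext %>%
  group_by(Components) %>%
  summarise(Total_Demand = sum(Component_Demand))

nrow(fg_Sales_ext)  # 6 sales records x 4 components each = 24 rows
component_demand
```

Because the individual sales rows are kept in fg_Sales_ext, the same table could also be grouped by Order_Date or FG_PartNumber if demand over time or per product is needed.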