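Consider the two example data frames below: df1 is a copy of my orders and SOH is my inventory. I want to merge df1$price into SOH, where:
- If SOH$arrival_year > df1$year, write the price associated with the most recent earlier year; if no such year exists, write NA.
- If an SOH item does not appear in df1, write NA for the price.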
supplier <- c(1,1,1,1,1,2,2)
item <- c(20,20,20,21,22,23,26)
year <- c(2000,2002,2008,2001,2007,2005,2009)
price <- c(.3,.4,.5,1.6,1.5,3.2,.25)
df1 <- data.frame(supplier, item, year, price)
#
supplier_on_hand <- c(1,1,1,1,1,1,2,2,3)
item_on_hand <- c(20,20,20,22,20,20,23,23,10)
arrival_year <- c(2000,2001,2002,2009,2007,2012,2006,2004,2009)
SOH <- data.frame(supplier_on_hand, item_on_hand, arrival_year)
The following output is required:
Answer 0 (score: 2)
Another possibility is to use the rolling join capability of the data.table package:
library(data.table)
setDT(df1)[setDT(SOH), on = .(supplier = supplier_on_hand, item = item_on_hand, year = arrival_year), roll = Inf]
# in a bit more readable format:
setDT(SOH)
setDT(df1)
df1[SOH, on = .(supplier = supplier_on_hand, item = item_on_hand, year = arrival_year), roll = Inf]
# or with setting keys first:
setDT(SOH, key = c('supplier_on_hand','item_on_hand','arrival_year'))
setDT(df1, key = c('supplier','item','year'))
df1[SOH, roll = Inf]
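The keyed version gives the following (the on = versions return the same rows, in SOH's original order):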
   supplier item year price
1:        1   20 2000   0.3
2:        1   20 2001   0.3
3:        1   20 2002   0.4
4:        1   20 2007   0.4
5:        1   20 2012   0.5
6:        1   22 2009   1.5
7:        2   23 2004    NA
8:        2   23 2006   3.2
9:        3   10 2009    NA
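If the goal is to keep SOH's own column names and original row order, the same rolling join can be used to add a price column to SOH by reference. The following is a minimal sketch, not part of the original answer; it rebuilds the question's sample data so it runs on its own, and it uses the x. prefix to refer to the price column of df1 inside the join.

```r
library(data.table)

# Same sample data as in the question, rebuilt so the block is self-contained.
df1 <- data.table(
  supplier = c(1, 1, 1, 1, 1, 2, 2),
  item     = c(20, 20, 20, 21, 22, 23, 26),
  year     = c(2000, 2002, 2008, 2001, 2007, 2005, 2009),
  price    = c(.3, .4, .5, 1.6, 1.5, 3.2, .25)
)
SOH <- data.table(
  supplier_on_hand = c(1, 1, 1, 1, 1, 1, 2, 2, 3),
  item_on_hand     = c(20, 20, 20, 22, 20, 20, 23, 23, 10),
  arrival_year     = c(2000, 2001, 2002, 2009, 2007, 2012, 2006, 2004, 2009)
)

# For each SOH row, take the price of the latest df1 row with the same
# supplier and item whose year is <= arrival_year (roll = Inf carries the
# last observation forward). Rows without such a match get NA.
SOH[, price := df1[SOH,
                   on = .(supplier = supplier_on_hand,
                          item = item_on_hand,
                          year = arrival_year),
                   roll = Inf,
                   x.price]]
SOH
```

Because each SOH row matches at most one df1 row in a rolling join, the looked-up prices come back in the same order as the rows of SOH, so they can be assigned directly as a new column.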
Answer 1 (score: 1)
The following seems to work for me:
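cbind(SOH, price =
  apply(SOH, 1, function(x) {
    # apply the supplier, item and year constraints
    # (x[1] = supplier_on_hand, x[2] = item_on_hand, x[3] = arrival_year)
    temp <- df1[df1$supplier == x[1] & df1$item == x[2] & df1$year <= x[3], ]
    # order by year, descending, so the most recent qualifying year comes first
    temp <- temp[order(temp$year, decreasing = TRUE), ]
    # temp[1, 'price'] is NA when no row of df1 satisfies the constraints
    # (this relies on df1 being a plain data.frame, as created in the question)
    temp[1, 'price']
  })
)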
Output:
  supplier_on_hand item_on_hand arrival_year price
1                1           20         2000   0.3
2                1           20         2001   0.3
3                1           20         2002   0.4
4                1           22         2009   1.5
5                1           20         2007   0.4
6                1           20         2012   0.5
7                2           23         2006   3.2
8                2           23         2004    NA
9                3           10         2009    NA
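One caveat with apply() is that it first converts SOH to a matrix, so every value would become a character string if SOH ever gained a non-numeric column. A hedged base-R variant that avoids the conversion is sketched below; lookup_price is a hypothetical helper name introduced only for this illustration, and the sample data is rebuilt so the block runs on its own.

```r
# Same sample data as in the question, as plain data frames.
df1 <- data.frame(
  supplier = c(1, 1, 1, 1, 1, 2, 2),
  item     = c(20, 20, 20, 21, 22, 23, 26),
  year     = c(2000, 2002, 2008, 2001, 2007, 2005, 2009),
  price    = c(.3, .4, .5, 1.6, 1.5, 3.2, .25)
)
SOH <- data.frame(
  supplier_on_hand = c(1, 1, 1, 1, 1, 1, 2, 2, 3),
  item_on_hand     = c(20, 20, 20, 22, 20, 20, 23, 23, 10),
  arrival_year     = c(2000, 2001, 2002, 2009, 2007, 2012, 2006, 2004, 2009)
)

# Hypothetical helper: price of the most recent df1 row for this supplier
# and item whose year is not later than the arrival year, or NA if none.
lookup_price <- function(s, i, y) {
  hit <- df1[df1$supplier == s & df1$item == i & df1$year <= y, ]
  if (nrow(hit) == 0) return(NA_real_)
  hit$price[which.max(hit$year)]
}

# mapply walks the three columns in parallel, keeping each column's type.
SOH$price <- mapply(lookup_price,
                    SOH$supplier_on_hand,
                    SOH$item_on_hand,
                    SOH$arrival_year)
SOH
```

This is still a row-by-row lookup, so for large tables the rolling join from the data.table answer is likely to be the faster option.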