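I have a large data file containing many references, each with different dates and quantities. Each row is a transaction with a date and a quantity. I need to find out whether transactions below a threshold are preceded by a larger transaction (in terms of quantity). I have achieved this, but I can't think of a less convoluted way to do it, and I'm sure one exists. I'd appreciate any hints. Here is a fully reproducible example: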
# load required package
require(data.table)
# make it fully reproducible
set.seed(1)
a <- data.table(
  ref   = sample(LETTERS[1:10], 300, TRUE),
  dates = sample(seq(as.Date("2017-08-01"), as.Date("2017-12-01"), "day"), 300, TRUE),
  qty   = sample(1:500, 300, TRUE)
)
# Compute some intermediate tables
# First one has all records below the threshold (20) with their dates
temp1 <- a[, .(dates, qLess = qty < 20, qty), by = ref][qLess == TRUE,]
# Second one has all records above threshold with minimum dates
temp2 <- a[, .(qGeq = qty >= 20, dates), by = ref][qGeq == TRUE,][, min(dates), by = ref]
# Join both tables on ref, filter those below the threshold and filter the ones that are actually preceded (prec) by a larger order. THIS IS THE EXPECTED RESULT
temp1[temp2, on = "ref"][, prec := V1 < dates][qLess == TRUE,][prec == TRUE,]
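The expected result should contain at least the ref and whether or not the transaction was preceded by a larger one, but ideally also the quantity and date (of the transaction below the threshold) and the preceding date (as in the example provided).
Answer 0: (score: 2)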
Another approach, using only the non-equi join capabilities of data.table:
setorder(a, ref, dates)
a[qty < 20][a[qty >= 20], on = .(ref, dates > dates), prev.big.date := i.dates, by = .EACHI][]
This returns the transactions below the threshold with an added prev.big.date column holding the date of the most recent earlier large transaction for the same ref, or NA where there is none.
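To see what the non-equi update join does, here is a minimal sketch on a tiny hand-made table. The names `toy`, `small` and `big` and the data itself are made up for illustration. The threshold of 20 is taken from the question.

```r
library(data.table)

# Hypothetical toy data: three transactions for ref A, one for ref B
toy <- data.table(
  ref   = c("A", "A", "A", "B"),
  dates = as.Date(c("2017-08-01", "2017-08-05", "2017-08-10", "2017-08-03")),
  qty   = c(5L, 100L, 7L, 3L)
)
setorder(toy, ref, dates)

small <- toy[qty < 20]   # transactions below the threshold
big   <- toy[qty >= 20]  # transactions at or above the threshold

# For each row of `big`, find rows of `small` with the same ref and a
# strictly later date, and stamp them with that large order's date.
small[big, on = .(ref, dates > dates), prev.big.date := i.dates, by = .EACHI]

print(small)
```

Rows of `small` that never match a row of `big` keep NA in `prev.big.date`. In this sketch only the ref A order of 2017-08-10 has an earlier large order.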
Answer 1: (score: 0)
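This is fairly straightforward. We set the key so the table is sorted by ref and dates, then flag "big" orders with 1. We set the preceding big order date to NA for small orders and to the order's own date for big orders, then fill the big order dates forward. The result contains the most recent big order date for each order, or a missing value if there was no previous big order.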
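# Sort by ref and date (setkey also sorts the table)
setkey(a, ref, dates)
# Flag large orders (qty >= 20) with 1, small orders with 0
a[, is_big := (qty >= 20) + 0L]
# Record the date on large orders only; small orders get NA
a[is_big == 1, preceding_big_date := dates]
# Fill forward within each ref; na.rm = FALSE keeps leading NAs so lengths match
a[, preceding_big_date := zoo::na.locf(preceding_big_date, na.rm = FALSE), by = ref]
# Keep only the small orders
new_result <- a[is_big == 0]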
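One detail worth checking in the fill step: `zoo::na.locf` drops leading NAs by default (`na.rm = TRUE`), so its result can be shorter than its input. That matters when the result is assigned back as a column by group. A small sketch, using a made-up vector `x`:

```r
library(zoo)

# Hypothetical dates for one ref: NA marks a small order
x <- as.Date(c(NA, "2017-08-05", NA, NA, "2017-08-20", NA))

# Default behaviour: the leading NA is removed, so the result is shorter
dropped <- zoo::na.locf(x)

# With na.rm = FALSE the leading NA is kept and the length is preserved
filled <- zoo::na.locf(x, na.rm = FALSE)

print(length(dropped))
print(filled)
```

With `na.rm = FALSE`, a small order that comes before any big order for its ref keeps NA instead of shifting the remaining values.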