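I have a table of orders mixed with account activity. For each order, I want to find the most recent activity on that order's account: iterate over the orders, find the activity rows with a matching account, and pick the one whose date is closest to the order date.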
rec type date account nearest.rec
1 Order 12/1/2016 A
2 Order 11/14/2016 B
3 Activity 11/13/2016 A
4 Activity 10/15/2016 C
5 Order 11/13/2016 C
6 Activity 11/16/2016 A
7 Activity 11/17/2016 A
8 Activity 10/14/2016 B
9 Activity 11/4/2016 B
I'd like to turn it into this:
rec type date account nearest.rec.actv
1 Order 12/1/2016 A 7
2 Order 11/14/2016 B 9
3 Activity 11/13/2016 A
4 Activity 10/15/2016 C
5 Order 11/13/2016 C 4
6 Activity 11/16/2016 A
7 Activity 11/17/2016 A
8 Activity 10/14/2016 B
9 Activity 11/4/2016 B
Or turn it into its own data frame:
rec type date account nearest.rec.actv actv.date
1 Order 12/1/2016 A 7 11/17/2016
2 Order 11/14/2016 B 9 11/4/2016
5 Order 11/13/2016 C 4 10/15/2016
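For anyone who wants to reproduce this, here is a minimal sketch that builds the input table as a data frame. The name df is an assumption (it is the name the answers use), the empty nearest.rec column is left out, and the dates are kept as text in their original m/d/Y form.

```r
# Sample data from the question; `df` is an assumed name.
df <- data.frame(
  rec     = 1:9,
  type    = c("Order", "Order", "Activity", "Activity", "Order",
              "Activity", "Activity", "Activity", "Activity"),
  date    = c("12/1/2016", "11/14/2016", "11/13/2016", "10/15/2016", "11/13/2016",
              "11/16/2016", "11/17/2016", "10/14/2016", "11/4/2016"),
  account = c("A", "B", "A", "C", "C", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)
```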
Answer 0 (score: 1)
Split the data by type, join the two parts on account, then filter down to the closest activity:
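library(dplyr)

df$date <- as.Date(df$date, "%m/%d/%Y")
ind <- df$type == "Order"
df1 <- df[ind, ]    # orders
df2 <- df[!ind, ]   # activities
left_join(df1, df2, by = "account") %>%
  # keep only activity on or before the order date, so the signed difference is never negative
  filter(date.y <= date.x) %>%
  # group per order (not per account) so several orders on one account each get their own match
  group_by(rec.x) %>%
  filter(date.x - date.y == min(date.x - date.y))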
# rec.x type.x date.x account rec.y type.y date.y
# <int> <chr> <date> <chr> <int> <chr> <date>
#1 1 Order 2016-12-01 A 7 Activity 2016-11-17
#2 2 Order 2016-11-14 B 9 Activity 2016-11-04
#3 5 Order 2016-11-13 C 4 Activity 2016-10-15
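The output above corresponds to the stand-alone data frame from the question. As a sketch of how the other requested layout might be produced (the full table with the nearest activity filled in on the order rows), the matches can be joined back onto the original data. The names orders, acts, nearest and result are assumptions for illustration, and slice_max() needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample data from the question, with dates already parsed.
df <- data.frame(
  rec     = 1:9,
  type    = c("Order", "Order", "Activity", "Activity", "Order",
              "Activity", "Activity", "Activity", "Activity"),
  date    = as.Date(c("12/1/2016", "11/14/2016", "11/13/2016", "10/15/2016", "11/13/2016",
                      "11/16/2016", "11/17/2016", "10/14/2016", "11/4/2016"), "%m/%d/%Y"),
  account = c("A", "B", "A", "C", "C", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

orders <- filter(df, type == "Order")
acts   <- filter(df, type == "Activity")

# One row per order: the latest activity on the same account, on or before the order date.
nearest <- orders %>%
  inner_join(acts, by = "account", suffix = c("", ".actv")) %>%
  filter(date.actv <= date) %>%
  group_by(rec) %>%
  slice_max(date.actv, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(rec, nearest.rec.actv = rec.actv, actv.date = date.actv)

# Attach the match to the full table; activity rows get NA.
result <- left_join(df, nearest, by = "rec")
```

Dropping the activity rows from result (for example with filter(type == "Order")) gives the stand-alone layout again.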
Answer 1 (score: 0)
This is not an efficient answer, but working through it step by step may help:
library(dplyr)
# subset into 2 data frames (assumes df$date has already been converted with as.Date)
df1 <- df[df$type == "Order", ]
df2 <- df[df$type == "Activity", ]
# Basic logic in mutate(): for each order, compute the time difference to every
# activity on the same account, find the minimum, and take the corresponding
# activity date (x) and record number (y).
# Note: this relies on one order per account and on all activity preceding the order.
df1 %>% group_by(account) %>%
  mutate(x = df2$date[account == df2$account][which.min(difftime(date, df2$date[account == df2$account]))],
         y = df2$rec[account == df2$account][which.min(difftime(date, df2$date[account == df2$account]))])
# rec type date account x y
# <int> <chr> <date> <chr> <date> <int>
#1 1 Order 2016-12-01 A 2016-11-17 7
#2 2 Order 2016-11-14 B 2016-11-04 9
#3 5 Order 2016-11-13 C 2016-10-15 4
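The same step-by-step logic can also be written as an explicit loop over the orders in base R, which may be easier to follow and does not depend on there being one order per account. This is only a sketch: the helper nearest_activity and the names orders and acts are hypothetical, and it assumes that only activity on or before the order date should count.

```r
# Sample data from the question, with dates already parsed.
df <- data.frame(
  rec     = 1:9,
  type    = c("Order", "Order", "Activity", "Activity", "Order",
              "Activity", "Activity", "Activity", "Activity"),
  date    = as.Date(c("12/1/2016", "11/14/2016", "11/13/2016", "10/15/2016", "11/13/2016",
                      "11/16/2016", "11/17/2016", "10/14/2016", "11/4/2016"), "%m/%d/%Y"),
  account = c("A", "B", "A", "C", "C", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

orders <- df[df$type == "Order", ]
acts   <- df[df$type == "Activity", ]

# Hypothetical helper: rec of the latest activity on `account` dated on or
# before `order_date`, or NA if there is none.
nearest_activity <- function(order_date, account, activities) {
  cand <- activities[activities$account == account & activities$date <= order_date, ]
  if (nrow(cand) == 0) return(NA_integer_)
  cand$rec[which.max(as.numeric(cand$date))]
}

# Iterate over the order rows by index so the Date class is preserved.
orders$nearest.rec.actv <- vapply(
  seq_len(nrow(orders)),
  function(i) nearest_activity(orders$date[i], orders$account[i], acts),
  integer(1)
)
```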
Answer 2 (score: 0)
Here is my solution using purrr and dplyr:
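As a rough sketch of how purrr and dplyr might be combined for this problem (this is an illustration under stated assumptions, not necessarily the code this answer originally contained): map over the order rows and look up the matching activity for each one. The helper nearest_for and the names orders, acts and result are hypothetical, and only activity on or before the order date is considered.

```r
library(dplyr)
library(purrr)

# Sample data from the question, with dates already parsed.
df <- data.frame(
  rec     = 1:9,
  type    = c("Order", "Order", "Activity", "Activity", "Order",
              "Activity", "Activity", "Activity", "Activity"),
  date    = as.Date(c("12/1/2016", "11/14/2016", "11/13/2016", "10/15/2016", "11/13/2016",
                      "11/16/2016", "11/17/2016", "10/14/2016", "11/4/2016"), "%m/%d/%Y"),
  account = c("A", "B", "A", "C", "C", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

orders <- filter(df, type == "Order")
acts   <- filter(df, type == "Activity")

# Hypothetical helper: for the i-th order, the rec of the latest activity on
# the same account dated on or before the order, or NA if there is none.
nearest_for <- function(i) {
  cand <- filter(acts, account == orders$account[i], date <= orders$date[i])
  if (nrow(cand) == 0) return(NA_integer_)
  cand$rec[which.max(as.numeric(cand$date))]
}

result <- orders %>%
  mutate(
    nearest.rec.actv = map_int(seq_len(n()), nearest_for),
    actv.date        = acts$date[match(nearest.rec.actv, acts$rec)]
  )
```

Mapping over row indices rather than over the date column itself avoids any question of whether the Date class survives the iteration.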