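The data below captures the monthly OPN (Optimum Product Number) for each Adv (Adv_Code). Change_Dt captures the month in which the Adv changed status from A to B.
Before the change month, all OPN belong to the Adv's status A; from the change month onward, all OPN belong to status B.
Here is the existing data: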
Adv_Code Change_Dt April_OPN May_OPN June_OPN July_OPN Aug_OPN Sep_OPN Oct_OPN Nov_OPN Dec_OPN Jan_OPN Feb_OPN March_OPN
A201 April 0 0 0 0 0 0 0 0 0 0 0 0
A198 July 2 0 0 1 2 0 5 0 0 0 0 0
S1212 Nov 0 3 4 0 0 3 0 1 0 0 0 0
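For anyone who wants to reproduce this, the sample can be loaded roughly as follows. This is only a sketch: the object name `df` is an assumption, and `stringsAsFactors = FALSE` is set so that Change_Dt stays a character column.

```r
# Sample data copied from the table above; the name `df` is an assumption.
df <- read.table(text = "
Adv_Code Change_Dt April_OPN May_OPN June_OPN July_OPN Aug_OPN Sep_OPN Oct_OPN Nov_OPN Dec_OPN Jan_OPN Feb_OPN March_OPN
A201 April 0 0 0 0 0 0 0 0 0 0 0 0
A198 July 2 0 0 1 2 0 5 0 0 0 0 0
S1212 Nov 0 3 4 0 0 3 0 1 0 0 0 0
", header = TRUE, stringsAsFactors = FALSE)

# Inspect the structure: 3 rows, 2 id columns and 12 monthly OPN columns
str(df)
```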
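I want to create the following data structure by converting to long format and deriving Adv_Status from the OPN month. That is, if Month_OPN is before Change_Dt, Adv_Status is A; otherwise it is B. Month_OPN covers the 12 months from April to March.
OPN captures the monthly OPN for each Adv, so it is the transpose of the values in the April_OPN to March_OPN columns for each Adv. Can someone help me do this in R?
Expected output:
Agent_Code Change_Dt Month_OPN Adv_Status OPN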
S1198201 April April B 0
S1198201 April May B 0
S1198201 April June B 0
S1198201 April July B 0
S1198201 April Aug B 0
S1198201 April Sep B 0
S1198201 April Oct B 0
S1198201 April Nov B 0
S1198201 April Dec B 0
S1198201 April Jan B 0
S1198201 April Feb B 0
S1198201 April Mar B 0
S1198203 July April A 2
S1198203 July May A 0
S1198203 July June A 0
S1198203 July July B 1
S1198203 July Aug B 2
S1198203 July Sep B 0
S1198203 July Oct B 5
S1198203 July Nov B 0
S1198203 July Dec B 0
S1198203 July Jan B 0
S1198203 July Feb B 0
S1198203 July Mar B 0
S1198212 Nov April A 0
S1198212 Nov May A 3
S1198212 Nov June A 4
S1198212 Nov July A 0
S1198212 Nov Aug A 0
S1198212 Nov Sep A 3
S1198212 Nov Oct A 0
S1198212 Nov Nov B 1
S1198212 Nov Dec B 0
S1198212 Nov Jan B 0
S1198212 Nov Feb B 0
S1198212 Nov Mar B 0
Answer 0 (score: 1)
Consider base R's reshape, followed by some cleanup and a month-number calculation that uses the built-in constants month.name and month.abb:
# RESHAPE
rdf <- reshape(df, idvar = c("Adv_Code", "Change_Dt"),
               varying = list(names(df)[-(1:2)]), v.names = "OPN",
               times = names(df)[-(1:2)], timevar = "Month_OPN",
               new.row.names = 1:1E5, direction = "long")
# CALCULATION
final_df <- within(rdf, {
    # RETRIEVE MONTH NUMBER FROM MONTH NAME/MONTH ABBREV (e.g., July or Jul => 7)
    Change_Dt_Num <- sapply(Change_Dt, function(x) max(which(month.name==x), which(month.abb==x)))
    # REMOVE THE "_OPN" SUFFIX FROM Month_OPN VALUES
    Month_OPN <- sub("_OPN", "", Month_OPN)
    # RETRIEVE MONTH NUMBER FROM MONTH NAME/MONTH ABBREV (e.g., July or Jul => 7)
    Month_OPN_Num <- sapply(Month_OPN, function(x) max(which(month.name==x), which(month.abb==x)))
    # ASSIGN "A" BEFORE THE CHANGE MONTH AND "B" FROM IT ONWARD, COMPARING POSITIONS
    # IN THE APRIL-TO-MARCH YEAR (April => 0, ..., March => 11) SO THAT A CHANGE
    # IN Jan, Feb, OR March IS ALSO HANDLED
    Adv_Status <- ifelse((Month_OPN_Num - 4) %% 12 < (Change_Dt_Num - 4) %% 12, "A", "B")
    # REMOVE HELPER COLUMNS (USED FOR ABOVE CALCULATION ONLY)
    rm(Change_Dt_Num, Month_OPN_Num)
})
# RE-ORDER ROWS AND RESET ROW NAMES
final_df <- with(final_df, final_df[order(Adv_Code),])
row.names(final_df) <- NULL
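As a rough illustration of the month arithmetic behind the status assignment, here is a minimal sketch of how a month label can be mapped to a calendar month number and then to a position in a year that starts in April. The helpers `month_num` and `fiscal_pos` are hypothetical names introduced only for this example. The second case, a change in Feb, is not in the sample data and is assumed here to show the wrap-around past December.

```r
# Hypothetical helper: calendar month number from a full name or a 3-letter abbreviation
month_num <- function(x) {
  num <- match(x, month.name)                    # "April" -> 4, NA if not a full name
  ifelse(is.na(num), match(x, month.abb), num)   # fall back to the abbreviation, "Aug" -> 8
}

# Hypothetical helper: position in an April-to-March year (April -> 0, ..., March -> 11)
fiscal_pos <- function(x) (month_num(x) - 4) %% 12

months <- c("April", "May", "June", "July", "Aug", "Sep",
            "Oct", "Nov", "Dec", "Jan", "Feb", "March")

# Change in July: April to June are "A", July onward is "B"
status_july <- ifelse(fiscal_pos(months) < fiscal_pos("July"), "A", "B")

# Change in Feb (assumed case): April to Jan are "A", Feb and March are "B"
status_feb <- ifelse(fiscal_pos(months) < fiscal_pos("Feb"), "A", "B")

print(data.frame(months, status_july, status_feb))
```

Comparing positions rather than raw month numbers avoids treating Jan (1) as earlier than April (4).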
Output
final_df
# Adv_Code Change_Dt Month_OPN OPN Adv_Status
# 1 A198 July April 2 A
# 2 A198 July May 0 A
# 3 A198 July June 0 A
# 4 A198 July July 1 B
# 5 A198 July Aug 2 B
# 6 A198 July Sep 0 B
# 7 A198 July Oct 5 B
# 8 A198 July Nov 0 B
# 9 A198 July Dec 0 B
# 10 A198 July Jan 0 B
# 11 A198 July Feb 0 B
# 12 A198 July March 0 B
# 13 A201 April April 0 B
# 14 A201 April May 0 B
# 15 A201 April June 0 B
# 16 A201 April July 0 B
# 17 A201 April Aug 0 B
# 18 A201 April Sep 0 B
# 19 A201 April Oct 0 B
# 20 A201 April Nov 0 B
# 21 A201 April Dec 0 B
# 22 A201 April Jan 0 B
# 23 A201 April Feb 0 B
# 24 A201 April March 0 B
# 25 S1212 Nov April 0 A
# 26 S1212 Nov May 3 A
# 27 S1212 Nov June 4 A
# 28 S1212 Nov July 0 A
# 29 S1212 Nov Aug 0 A
# 30 S1212 Nov Sep 3 A
# 31 S1212 Nov Oct 0 A
# 32 S1212 Nov Nov 1 B
# 33 S1212 Nov Dec 0 B
# 34 S1212 Nov Jan 0 B
# 35 S1212 Nov Feb 0 B
# 36 S1212 Nov March 0 B
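For comparison, here is an alternative sketch that skips the month-number lookup and relies on the column order instead, since the OPN columns already run from April to March. It builds the long table by transposing the OPN values, as the question describes. It assumes that Change_Dt is spelled exactly like one of the column prefixes (for example "Nov" for Nov_OPN); the names `opn_cols`, `month_labels`, and `long_df` are hypothetical.

```r
# Sample data from the question; the name `df` is an assumption.
df <- read.table(text = "
Adv_Code Change_Dt April_OPN May_OPN June_OPN July_OPN Aug_OPN Sep_OPN Oct_OPN Nov_OPN Dec_OPN Jan_OPN Feb_OPN March_OPN
A201 April 0 0 0 0 0 0 0 0 0 0 0 0
A198 July 2 0 0 1 2 0 5 0 0 0 0 0
S1212 Nov 0 3 4 0 0 3 0 1 0 0 0 0
", header = TRUE, stringsAsFactors = FALSE)

# OPN columns in their existing April-to-March order, and their month labels
opn_cols     <- grep("_OPN$", names(df), value = TRUE)
month_labels <- sub("_OPN$", "", opn_cols)

long_df <- data.frame(
  Adv_Code  = rep(df$Adv_Code, each = length(opn_cols)),
  Change_Dt = rep(df$Change_Dt, each = length(opn_cols)),
  Month_OPN = rep(month_labels, times = nrow(df)),
  # Transpose so the values come out Adv by Adv, each with its 12 months in order
  OPN       = as.vector(t(as.matrix(df[opn_cols]))),
  stringsAsFactors = FALSE
)

# "A" when the month's column comes before the change month's column, otherwise "B".
# Assumption: every Change_Dt value matches one of month_labels exactly.
long_df$Adv_Status <- with(long_df,
  ifelse(match(Month_OPN, month_labels) < match(Change_Dt, month_labels), "A", "B"))

print(long_df)
```

Rows stay in the input order of Adv_Code here rather than being sorted.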