I have a long-format data frame like the following:
df <- structure(list(Date =c("2011-01", "2011-08", "2012-03", "2011-01", "2011-08", "2011-01", "2011-08", "2011-01", "2011-08",
"2011-01", "2011-08", "2012-03", "2011-01", "2011-08", "2011-01", "2011-08", "2011-01", "2011-08",
"2011-01", "2011-08", "2012-03", "2011-01", "2011-08", "2011-01", "2011-08", "2011-01", "2011-08"),
Part=c("A", "A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B", "B", "B", "B",
"C", "C", "C", "C", "C", "C", "C", "C", "C"),
method=c("Type1","Type1","Type1","Type2","Type2","Type3","Type3","Type4","Type4",
"Type1","Type1","Type1","Type2","Type2","Type3","Type3","Type4","Type4",
"Type1","Type1","Type1","Type2","Type2","Type3","Type3","Type4","Type4"),
value= c(4L, 46L, 43L, 9L, 8L, 46L, 63L, 84L, 2L, 5L, 78L, 2L, 89L, 2L, 6L, 62L, 25L, 46L, 3L, 4L, 7L, 24L, 13L, 21L, 19L, 8L, 3L)),
class= "data.frame", row.names=c(NA, -27L))
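I want to create another column named BestMethod. Its value should list the method(s) whose value is closest to the Type3 value, by Part and Date.
For example, for Part A in 2011-01, Types 1, 2, and 3 were applied and Type1 is closest to Type3, so I would put Type1 under BestMethod. Otherwise, if not all 3 types were applied, I would put NA.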
(In Excel, it would look something like this:
=INDEX(C2:F2, MATCH(MIN(ABS(C2:F2-B2)), ABS(C2:F2-B2),0))
and then this:
=IF(B2="", "NA", INDEX($C$1:$F$1,1,(MATCH(H2,C2:F2,0)))))
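As a minimal sketch of the rule described above, the following base R function picks the method(s) closest to Type3 within a single Part/Date group. The helper name `closest_to_type3` and the choice to return NA whenever Type3 is missing from the group are assumptions made for illustration; ties are pasted together with a space.

```r
# One Part/Date group from the sample data (Part A, 2011-01)
grp <- data.frame(method = c("Type1", "Type2", "Type3", "Type4"),
                  value  = c(4L, 9L, 46L, 84L))

# Hypothetical helper: name(s) of the non-Type3 method(s) whose value is
# closest to the Type3 value; NA if the group has no single Type3 row.
closest_to_type3 <- function(g) {
  ref <- g$value[g$method == "Type3"]
  if (length(ref) != 1) return(NA_character_)
  others <- g[g$method != "Type3", ]
  if (nrow(others) == 0) return(NA_character_)
  d <- abs(others$value - ref)          # distance of each method to Type3
  paste(others$method[d == min(d)], collapse = " ")
}

closest_to_type3(grp)
```

Applying such a function to every Part/Date group (for example with `split()` and `sapply()`) would give one BestMethod value per group.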
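Then I want to create another column named FinalMethod. For each Part, I want to copy the most frequently listed type to all dates.
For example, if for Part A Type1 is the better match in 2011-01 and 2011-02 but Type2 is the better match in 2011-03, I want Type1 to be the FinalMethod for all dates of that Part.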
I tried the following:
which(abs(x-your.number)==min(abs(x-your.number)))
but I have trouble picking up the right data values each time and running it on every row.
Thanks.
Desired output:
df <- structure(list(Date =c("2011-01", "2011-08", "2012-03", "2011-01", "2011-08", "2011-01", "2011-08", "2011-01", "2011-08",
"2011-01", "2011-08", "2012-03", "2011-01", "2011-08", "2011-01", "2011-08", "2011-01", "2011-08",
"2011-01", "2011-08", "2012-03", "2011-01", "2011-08", "2011-01", "2011-08", "2011-01", "2011-08"),
Part=c("A", "A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B", "B", "B", "B",
"C", "C", "C", "C", "C", "C", "C", "C", "C"),
method=c("Type1","Type1","Type1","Type2","Type2","Type3","Type3","Type4","Type4",
"Type1","Type1","Type1","Type2","Type2","Type3","Type3","Type4","Type4",
"Type1","Type1","Type1","Type2","Type2","Type3","Type3","Type4","Type4"),
value= c(4L, 46L, 43L, 9L, 8L, 46L, 63L, 84L, 2L, 5L, 78L, 2L, 89L, 2L, 6L, 62L, 25L, 46L, 3L, 4L, 7L, 24L, 13L, 21L, 19L, 8L, 3L),
BestModel=c("Type2", "Type1", "NA", "Type2", "Type1", "Type2", "Type1", "Type2", "Type1",
"Type1", "Type1Type4", "NA", "Type1", "Type1Type4", "Type1", "Type1Type4","Type1", "Type1Type4",
"Type2", "Type2", "NA", "Type2", "Type2", "Type2", "Type2", "Type2", "Type2"),
FinalModel= c("Type1Type2", "Type1Type2","Type1Type2", "Type1Type2","Type1Type2", "Type1Type2","Type1Type2","Type1Type2","Type1Type2",
"Type1", "Type1", "Type1", "Type1", "Type1", "Type1","Type1", "Type1", "Type1",
"Type2", "Type2","Type2", "Type2", "Type2", "Type2","Type2", "Type2", "Type2")),
class= "data.frame", row.names=c(NA, -27L))
Answer 0 (score: 1)
A not very elegant solution using dplyr + tidyr, but it works:
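library(dplyr)
library(tidyr)

# For each Part/Date group, copy the Type3 value onto every row, then keep
# the non-Type3 method(s) whose value is closest to it. Groups without a
# Type3 row only have NA differences, so they drop out at the last filter.
temp = df %>%
  group_by(Part, Date) %>%
  mutate(value.x = ifelse(method == "Type3", value, NA)) %>%
  fill(value.x, .direction = "up") %>%
  fill(value.x) %>%
  mutate(difference = abs(value.x - value)) %>%
  filter(method != "Type3") %>%
  filter(difference == min(difference))

# One row per Part/Date; tied methods are pasted together
BestMethod = temp %>%
  summarize(BestMethod = paste(method, collapse = " "))

# Per Part, the method(s) that win most often (a tie yields several rows)
FinalMethod = temp %>%
  group_by(Part, method) %>%
  summarize(count = n()) %>%
  filter(count == max(count)) %>%
  rename(FinalMethod = method)

# Attach both columns to the original data
df %>%
  full_join(BestMethod) %>%
  full_join(FinalMethod) %>%
  select(-count) %>%
  arrange(Part, Date)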
Result:
      Date Part method value  BestMethod FinalMethod
1  2011-01    A  Type1     4       Type2       Type1
2  2011-01    A  Type1     4       Type2       Type2
3  2011-01    A  Type2     9       Type2       Type1
4  2011-01    A  Type2     9       Type2       Type2
5  2011-01    A  Type3    46       Type2       Type1
6  2011-01    A  Type3    46       Type2       Type2
7  2011-01    A  Type4    84       Type2       Type1
8  2011-01    A  Type4    84       Type2       Type2
9  2011-08    A  Type1    46       Type1       Type1
10 2011-08    A  Type1    46       Type1       Type2
11 2011-08    A  Type2     8       Type1       Type1
12 2011-08    A  Type2     8       Type1       Type2
13 2011-08    A  Type3    63       Type1       Type1
14 2011-08    A  Type3    63       Type1       Type2
15 2011-08    A  Type4     2       Type1       Type1
16 2011-08    A  Type4     2       Type1       Type2
17 2012-03    A  Type1    43        <NA>       Type1
18 2012-03    A  Type1    43        <NA>       Type2
19 2011-01    B  Type1     5       Type1       Type1
20 2011-01    B  Type2    89       Type1       Type1
21 2011-01    B  Type3     6       Type1       Type1
22 2011-01    B  Type4    25       Type1       Type1
23 2011-08    B  Type1    78 Type1 Type4       Type1
24 2011-08    B  Type2     2 Type1 Type4       Type1
25 2011-08    B  Type3    62 Type1 Type4       Type1
26 2011-08    B  Type4    46 Type1 Type4       Type1
27 2012-03    B  Type1     2        <NA>       Type1
28 2011-01    C  Type1     3       Type2       Type2
29 2011-01    C  Type2    24       Type2       Type2
30 2011-01    C  Type3    21       Type2       Type2
31 2011-01    C  Type4     8       Type2       Type2
32 2011-08    C  Type1     4       Type2       Type2
33 2011-08    C  Type2    13       Type2       Type2
34 2011-08    C  Type3    19       Type2       Type2
35 2011-08    C  Type4     3       Type2       Type2
36 2012-03    C  Type1     7        <NA>       Type2
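The rows for Part A appear twice in this result because Type1 and Type2 are tied as the most frequent winner, so the join on Part matches two FinalMethod rows. One possible variation, sketched here and not part of the answer's code, is to paste tied methods into a single string per Part before joining. The `counts` data frame below reproduces the per-Part win counts that the sample data yields and stands in for the output of the `summarize(count = n())` step.

```r
library(dplyr)

# Win counts per Part and method, as derived from the sample data
# (stand-in for the summarize(count = n()) step; Part A has a tie).
counts <- data.frame(
  Part   = c("A", "A", "B", "B", "C"),
  method = c("Type1", "Type2", "Type1", "Type4", "Type2"),
  count  = c(1L, 1L, 2L, 1L, 2L)
)

# Keep the most frequent method(s) per Part and collapse ties into one
# string, so that each Part contributes exactly one row to the join.
FinalMethod <- counts %>%
  group_by(Part) %>%
  filter(count == max(count)) %>%
  summarize(FinalMethod = paste(method, collapse = " "))

FinalMethod
```

With one row per Part, joining this table to the original data by Part would keep the original 27 rows, with Part A carrying the combined value "Type1 Type2".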