Combine data from rows based on a condition or sequence

Time: 2016-12-10 01:24:33

Tags: r merge conditional sequence

I have the following data:

Data <- data.frame(Project=c(123,123,123,123,123,123,124,124,124,124,124,125,125,125,126,126),
                     Value=c(1,4,7,3,8,9,8,3,2,5,6,2,2,1,8,3),
                     OldValue=c("","Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","","Open","In Progress","In Progress",""),
                     NewValue=c("Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","Complete","Open","In Progress","Complete","","In Progress"))

Data$First <- ifelse(((Data$OldValue==""|Data$OldValue=="Complete"|Data$OldValue=="System Declined")&Data$NewValue=="Open"),Data$Value,NA)
Data$Second <- ifelse(((Data$OldValue=="Open"|Data$OldValue=="Complete"|Data$OldValue=="System Declined")&Data$NewValue=="In Progress"),Data$Value,NA)
Data$Third <- ifelse(((Data$NewValue=="Complete"|Data$NewValue=="System Declined")&Data$OldValue=="In Progress"),Data$Value,NA)

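For each unique Project ID, I want to combine the First, Second and Third values into a single row. I only want to do this when the values in the NewValue column follow one of these sequences:

Open, In Progress, Complete or Open, In Progress, System Declined

So Project 123 would have two rows of data, while Projects 124 and 125 would have one each. Rows 10 and 11 would be excluded because they do not follow either sequence above.

What is the simplest way to code this?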

2 Answers:

Answer 0: (score: 0)

A solution using dplyr (the original code block is missing from this copy):
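
A minimal sketch of how a dplyr solution to this question could look. This is not the answer author's code, only an illustration under stated assumptions: it reads the three values straight from Value instead of the First, Second and Third helper columns, it treats the rows within each Project as already being in chronological order, and the column names next1 and next2 are hypothetical helpers introduced here.

```r
library(dplyr)

# Same sample data as in the question, kept as character columns.
Data <- data.frame(
  Project  = c(123,123,123,123,123,123,124,124,124,124,124,125,125,125,126,126),
  Value    = c(1,4,7,3,8,9,8,3,2,5,6,2,2,1,8,3),
  OldValue = c("","Open","In Progress","Complete","Open","In Progress","Complete","Open",
               "In Progress","System Declined","In Progress","","Open","In Progress",
               "In Progress",""),
  NewValue = c("Open","In Progress","Complete","Open","In Progress","Complete","Open",
               "In Progress","System Declined","In Progress","Complete","Open",
               "In Progress","Complete","","In Progress"),
  stringsAsFactors = FALSE
)

result <- Data %>%
  group_by(Project) %>%
  mutate(
    next1  = lead(NewValue, 1),   # status one row later (hypothetical helper)
    next2  = lead(NewValue, 2),   # status two rows later (hypothetical helper)
    Second = lead(Value, 1),      # value that goes with next1
    Third  = lead(Value, 2)       # value that goes with next2
  ) %>%
  ungroup() %>%
  # Keep only rows that start one of the two accepted sequences.
  filter(NewValue == "Open",
         next1 == "In Progress",
         next2 %in% c("Complete", "System Declined")) %>%
  transmute(Project, First = Value, Second, Third)

print(result)
```

On the sample data this keeps four rows: two for Project 123 and one each for Projects 124 and 125, which is the shape the question asks for. Rows 10 and 11 and both rows of Project 126 drop out because no accepted sequence starts there.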

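Answer 1: (score: 0)

Here is one way to achieve your goal. I wanted to create a sequence of numbers that could represent the two patterns you specified (i.e., Open-In Progress-Complete and Open-In Progress-System Declined). For that reason, I used fct_collapse() to collapse the factor levels into three levels, and then converted the new factor levels to numbers. Next, I wanted to create subgroups within each Project, which I did in the second mutate(). The next task was to change the order of the elements in First, Second and Third. You want the numbers on one row, so I used sort(). There is a condition for applying this operation: identical(check[1:3], as.numeric(1:3)). If you have either of the two patterns, you should expect the sequence 1, 2, 3 in check. This logical check is applied to each group. Whenever the condition is met, sort() is applied to the three columns in each group defined by Project and group. Finally, I removed check and group, which I used throughout the operation.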

library(dplyr)
library(forcats)

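Data %>%
  # Recode NewValue as 1 (Open), 2 (In Progress), 3 (Complete / System Declined);
  # any other value, such as "", becomes NA.
  mutate(check = as.numeric(
                     as.character(fct_collapse(NewValue,
                                               `1` = "Open",
                                               `2` = "In Progress",
                                               `3` = c("Complete", "System Declined"))))) %>%
  group_by(Project) %>%
  # Start a new subgroup whenever check does not increase by exactly 1.
  mutate(group = cumsum(c(TRUE, diff(check) != 1))) %>%
  group_by(Project, group) %>%
  # In subgroups whose codes are exactly 1, 2, 3, move the non-NA values to the first row.
  # Note: funs() is deprecated in recent dplyr releases; across() is its replacement.
  mutate_at(vars(First:Third),
            funs(if(identical(check[1:3], as.numeric(1:3))){
                 sort(., na.last = TRUE)} else{.}
            )) %>%
  # check is dropped; group is a grouping variable, so dplyr keeps it in the output.
  select(-check, -group)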

#   group Project Value        OldValue        NewValue First Second Third
#   <int>   <dbl> <dbl>          <fctr>          <fctr> <dbl>  <dbl> <dbl>
#1      1     123     1                            Open     1      4     7
#2      1     123     4            Open     In Progress    NA     NA    NA
#3      1     123     7     In Progress        Complete    NA     NA    NA
#4      2     123     3        Complete            Open     3      8     9
#5      2     123     8            Open     In Progress    NA     NA    NA
#6      2     123     9     In Progress        Complete    NA     NA    NA
#7      1     124     8        Complete            Open     8      3     2
#8      1     124     3            Open     In Progress    NA     NA    NA
#9      1     124     2     In Progress System Declined    NA     NA    NA
#10     2     124     5 System Declined     In Progress    NA      5    NA
#11     2     124     6     In Progress        Complete    NA     NA     6
#12     1     125     2                            Open     2      2     1
#13     1     125     2            Open     In Progress    NA     NA    NA
#14     1     125     1     In Progress        Complete    NA     NA    NA
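
The subgroup numbering and the 1, 2, 3 test described in this answer can be looked at in isolation. The following is a small base R sketch, not part of the original answer; the vector check is typed in by hand to mirror the codes for Project 124, and is_full_sequence and second are hypothetical names used only for illustration.

```r
# Codes for NewValue in Project 124 after the recoding step:
# Open = 1, In Progress = 2, Complete / System Declined = 3.
check <- c(1, 2, 3, 2, 3)

# A new subgroup starts wherever the code does not increase by exactly 1.
group <- cumsum(c(TRUE, diff(check) != 1))
print(group)

# Per subgroup, test whether the first three codes are exactly 1, 2, 3.
is_full_sequence <- tapply(check, group,
                           function(x) identical(x[1:3], as.numeric(1:3)))
print(is_full_sequence)

# sort(na.last = TRUE) keeps the NAs but moves the non-NA entry to the top,
# which is how the three values end up on the first row of a subgroup.
second <- c(NA, 3, NA)
print(sort(second, na.last = TRUE))
```

Here the first three rows form subgroup 1 and pass the test, while the last two rows (In Progress, Complete) form subgroup 2 and fail it, which is why rows 10 and 11 keep their NA values in the output above.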