I have the following data:
Data <- data.frame(Project=c(123,123,123,123,123,123,124,124,124,124,124,125,125,125,126,126),
Value=c(1,4,7,3,8,9,8,3,2,5,6,2,2,1,8,3),
OldValue=c("","Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","","Open","In Progress","In Progress",""),
NewValue=c("Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","Complete","Open","In Progress","Complete","","In Progress"))
Data$First <- ifelse(((Data$OldValue==""|Data$OldValue=="Complete"|Data$OldValue=="System Declined")&Data$NewValue=="Open"),Data$Value,NA)
Data$Second <- ifelse(((Data$OldValue=="Open"|Data$OldValue=="Complete"|Data$OldValue=="System Declined")&Data$NewValue=="In Progress"),Data$Value,NA)
Data$Third <- ifelse(((Data$NewValue=="Complete"|Data$NewValue=="System Declined")&Data$OldValue=="In Progress"),Data$Value,NA)
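For each unique Project ID, I want to combine the First, Second, and Third values into a single row. I only want to do this if the values in the NewValue column follow one of these sequences:
Open, In Progress, Complete or Open, In Progress, System Declined
So Project 123 will have two rows of data, while Projects 124 and 125 will each have one. Rows 10 and 11 will be excluded because they do not follow the sequences above.
What is the simplest way to code this?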
Answer 0 (score: 0)
A solution using dplyr:
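As a minimal sketch of what a dplyr approach could look like (an assumption, not necessarily the answerer's code): within each Project, look one and two rows ahead with lead() and keep only the rows where NewValue starts one of the two allowed sequences. The helper columns next1 and next2 are hypothetical names introduced here, and the sketch assumes the sequence is judged on NewValue alone, as the question states.

```r
library(dplyr)

# Same Project / Value / NewValue columns as in the question
Data <- data.frame(
  Project  = c(123,123,123,123,123,123,124,124,124,124,124,125,125,125,126,126),
  Value    = c(1,4,7,3,8,9,8,3,2,5,6,2,2,1,8,3),
  NewValue = c("Open","In Progress","Complete","Open","In Progress","Complete",
               "Open","In Progress","System Declined","In Progress","Complete",
               "Open","In Progress","Complete","","In Progress"),
  stringsAsFactors = FALSE
)

result <- Data %>%
  group_by(Project) %>%
  mutate(
    next1  = lead(NewValue, 1),  # status one row later (hypothetical helper)
    next2  = lead(NewValue, 2),  # status two rows later (hypothetical helper)
    Second = lead(Value, 1),     # value belonging to the "In Progress" step
    Third  = lead(Value, 2)      # value belonging to the closing step
  ) %>%
  ungroup() %>%
  # Keep only rows that start Open -> In Progress -> Complete / System Declined
  filter(NewValue == "Open",
         next1 == "In Progress",
         next2 %in% c("Complete", "System Declined")) %>%
  transmute(Project, First = Value, Second, Third)

print(result)
```

On the sample data this should leave four rows: two for Project 123 and one each for Projects 124 and 125. Rows 10 and 11 drop out because no "Open" row starts their sequence.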
Answer 1 (score: 0)
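Here is one way to achieve your goal. I wanted to create a numeric sequence that can represent the two patterns you specified (i.e., Open-In Progress-Complete and Open-In Progress-System Declined). For this reason, I used fct_collapse() to collapse the factor levels into three levels, and then converted the new levels to numbers. Next, I wanted to create subgroups within each Project, which is done in the second mutate(). The next task was to change the order of the elements in First, Second, and Third, since you want the numbers on one row, so I used sort(). This operation is applied under one condition, identical(check[1:3], as.numeric(1:3)): if a group follows either of the two patterns, you should expect the sequence 1, 2, 3 in check. This logical check is run for each group, and whenever it holds, sort() is applied to the three columns within each group defined by Project and group. Finally, I removed check and group, which were only needed during the operation.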
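library(dplyr)
library(forcats)

Data %>%
  # Encode NewValue as 1 (Open), 2 (In Progress), 3 (Complete / System Declined)
  mutate(check = as.numeric(
    as.character(fct_collapse(NewValue,
                              `1` = "Open",
                              `2` = "In Progress",
                              `3` = c("Complete", "System Declined"))))) %>%
  group_by(Project) %>%
  # Start a new subgroup whenever check does not increase by exactly 1
  mutate(group = cumsum(c(TRUE, diff(check) != 1))) %>%
  group_by(Project, group) %>%
  # If the subgroup starts with 1, 2, 3, move the non-NA values to its first row
  # (across() replaces the deprecated mutate_at() + funs() combination)
  mutate(across(First:Third,
                ~ if (identical(check[1:3], as.numeric(1:3))) {
                    sort(.x, na.last = TRUE)
                  } else {
                    .x
                  })) %>%
  # group is a grouping variable, so select() keeps it in the output
  select(-check, -group)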
#   group Project Value OldValue        NewValue        First Second Third
#   <int>   <dbl> <dbl> <fctr>          <fctr>          <dbl>  <dbl> <dbl>
#1      1     123     1                 Open                1      4     7
#2      1     123     4 Open            In Progress        NA     NA    NA
#3      1     123     7 In Progress     Complete           NA     NA    NA
#4      2     123     3 Complete        Open                3      8     9
#5      2     123     8 Open            In Progress        NA     NA    NA
#6      2     123     9 In Progress     Complete           NA     NA    NA
#7      1     124     8 Complete        Open                8      3     2
#8      1     124     3 Open            In Progress        NA     NA    NA
#9      1     124     2 In Progress     System Declined    NA     NA    NA
#10     2     124     5 System Declined In Progress        NA      5    NA
#11     2     124     6 In Progress     Complete           NA     NA     6
#12     1     125     2                 Open                2      2     1
#13     1     125     2 Open            In Progress        NA     NA    NA
#14     1     125     1 In Progress     Complete           NA     NA    NA
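The encoding and subgrouping steps described in this answer can be checked in isolation. The following base R sketch uses only the NewValue sequence of Project 124; the lookup vector code is a hypothetical stand-in for the fct_collapse() step, chosen so the example runs without forcats.

```r
# NewValue for Project 124, in row order
new_value <- c("Open", "In Progress", "System Declined", "In Progress", "Complete")

# Map each status to its position in the target pattern;
# both closing states share position 3 (stand-in for fct_collapse())
code <- c("Open" = 1, "In Progress" = 2, "Complete" = 3, "System Declined" = 3)
check <- unname(code[new_value])

# Start a new subgroup whenever the code does not increase by exactly 1
group <- cumsum(c(TRUE, diff(check) != 1))

# For each subgroup, test whether it begins with the sequence 1, 2, 3
matches <- tapply(check, group, function(x) identical(x[1:3], c(1, 2, 3)))

print(check)    # 1 2 3 2 3
print(group)    # 1 1 1 2 2
print(matches)  # subgroup 1 matches, subgroup 2 does not
```

Subgroup 2 contains only In Progress and Complete (codes 2, 3), so the identical() test fails there and rows 10 and 11 are left unsorted, which is why they keep their values in the Second and Third columns of the output above.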