I have the following data:
Data <- data.frame(Project=c(123,123,123,123,123,123,123,123,124,124,124,124,124,125,125),
Value=c(1,2,3,4,7,3,8,9,8,3,2,5,6,2,3),
OldValue=c("","Open","In Progress","Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","","Open"),
NewValue=c("Open","In Progress","Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","Complete","Open","Complete"))
Data$Request <- ifelse(((Data$OldValue==""|Data$OldValue=="Complete"|Data$OldValue=="System Declined"|Data$OldValue=="In Progress")&Data$NewValue=="Open"),Data$Value,NA)
Data$Start <- ifelse(((Data$OldValue=="Open"|Data$OldValue=="Complete"|Data$OldValue=="System Declined")&Data$NewValue=="In Progress"),Data$Value,NA)
Data$End <- ifelse(((Data$NewValue=="Complete"|Data$NewValue=="System Declined")&(Data$OldValue=="Open"|Data$OldValue=="In Progress")),Data$Value,NA)
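I would like to determine the number of completed projects, and their associated values, for each unique project ID. A completed project is one where both the "Request" and "End" fields are populated ("Start" is not a required field).
I would like to capture the information using the following criteria:
I am only interested in the first instance of each of the "Request", "Start" and "End" fields. Example: the first 5 rows show two "Request" values, two "Start" values and one "End" value. I would like to consolidate the first instance of each field into one row, so the result would be 1, 2, 7.
In the example above, I want the 1, 2, 7 values consolidated on the row with the first instance of "Open" in "NewValue", not the second, and not both.
Each project ID can have multiple completed projects. Example: project ID 123 should have two completed projects, with values 1, 2, 7 and 3, 8, 9.
This is the result I am looking for: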
Result <- data.frame(Project=c(123,123,124,125),
Value=c(1,3,8,2),
OldValue=c("","Complete","Complete",""),
NewValue=c("Open","Open","Open","Open"),
Request=c(1,3,8,2),
Start=c(2,8,3,NA),
End=c(7,9,2,3))
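Any coding help is greatly appreciated.
Answer 0 (score: 1)
It is not entirely clear what the OP wants, but I hope you can adapt this code to get the desired result: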
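library(data.table)
Data <- as.data.table(Data)
# For each Project/Value group, keep OldValue and NewValue and take the first
# non-missing Request, Start and End ([1L] yields NA when none exists)
Data2 <- Data[, .(OldValue, NewValue,
                  Request = Request[!is.na(Request)][1L],
                  Start = Start[!is.na(Start)][1L],
                  End = End[!is.na(End)][1L]),
              by = .(Project, Value)]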
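Grouping by Project and Value does not collapse a request with its later start and end rows. One possible adaptation, sketched here in base R, is to number "cycles" within each project so that a new cycle begins right after every row with a populated End, and then take the first non-missing value of each field per cycle. This is only one reading of the question: the cycle rule, the `Cycle` column, and the names `first_non_na`, `closed`, `pieces` and `Result2` are assumptions introduced for illustration and do not appear in the original post.

```r
# Rebuild the question's data so that the sketch runs on its own.
Data <- data.frame(
  Project  = c(123,123,123,123,123,123,123,123,124,124,124,124,124,125,125),
  Value    = c(1,2,3,4,7,3,8,9,8,3,2,5,6,2,3),
  OldValue = c("","Open","In Progress","Open","In Progress","Complete","Open",
               "In Progress","Complete","Open","In Progress","System Declined",
               "In Progress","","Open"),
  NewValue = c("Open","In Progress","Open","In Progress","Complete","Open",
               "In Progress","Complete","Open","In Progress","System Declined",
               "In Progress","Complete","Open","Complete"),
  stringsAsFactors = FALSE
)

# Same Request/Start/End rules as in the question, written with %in%.
closed <- c("Complete", "System Declined")
Data$Request <- ifelse(Data$OldValue %in% c("", closed, "In Progress") &
                         Data$NewValue == "Open", Data$Value, NA)
Data$Start <- ifelse(Data$OldValue %in% c("Open", closed) &
                       Data$NewValue == "In Progress", Data$Value, NA)
Data$End <- ifelse(Data$NewValue %in% closed &
                     Data$OldValue %in% c("Open", "In Progress"), Data$Value, NA)

# Hypothetical helper: first non-missing element, or NA if there is none.
first_non_na <- function(x) x[!is.na(x)][1L]

# Assumption: a cycle ends at each row with a populated End, so the cycle id
# is the number of End rows seen earlier within the same project.
is_end <- as.integer(!is.na(Data$End))
Data$Cycle <- ave(is_end, Data$Project, FUN = function(e) cumsum(e) - e)

# Split by project and cycle (lex.order keeps each project's cycles together).
groups <- split(Data, interaction(Data$Project, Data$Cycle,
                                  drop = TRUE, lex.order = TRUE))

pieces <- lapply(groups, function(d) {
  i <- which(!is.na(d$Request))[1L]  # row of the first Request in the cycle
  # Keep only completed cycles: both Request and End must be populated.
  if (is.na(i) || all(is.na(d$End))) return(NULL)
  data.frame(Project = d$Project[i], Value = d$Value[i],
             OldValue = d$OldValue[i], NewValue = d$NewValue[i],
             Request = d$Request[i],
             Start = first_non_na(d$Start),
             End = first_non_na(d$End),
             stringsAsFactors = FALSE)
})

Result2 <- do.call(rbind, Filter(Negate(is.null), pieces))
rownames(Result2) <- NULL
print(Result2)
```

On the sample data this keeps four rows (two for project 123, one each for 124 and 125). The second cycle of project 124 is dropped because it has an End but no Request. Counting completed projects per ID is then `table(Result2$Project)`.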