   PORT      STATUS     VESSEL          DWT   IMP/EXP  QTY (Mts)
1  KANDLA    SAILED     CAPTAIN HAMADA  7938  EXP      4500
2  KAKINADA  EXPECTED   CELON BREEZE          IMP      30000
3  KAKINADA  BERTH      CELON BREEZE          IMP      3000
4  KAKINADA  SAILED     CELON BREEZE          IMP      30000
5  KANDLA    ANCHORAGE  CAPTAIN HAMADA        EXP      4500
6  KAKINADA  BERTH      CELON BREEZE          IMP      30000
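I want to compare each row with the other rows on (PORT, VESSEL, IMP/EXP) and delete rows that match. If IMP/EXP in the row is "IMP", rows are deleted according to this STATUS priority: sailed > berth > anchorage > expected.
A row with STATUS = SAILED takes priority over a matching row with a lower status. For example, row 2 is deleted because it matches row 4 on qty, port, and vessel. Once rows match on those columns, the rules are: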
1) If one row has STATUS = SAILED and another has BERTH, delete the BERTH row.
2) If one row has STATUS = SAILED and another has EXPECTED, delete the EXPECTED row.
3) If one row has STATUS = BERTH and another has ANCHORAGE, delete the ANCHORAGE row.
4) If one row has STATUS = EXPECTED and another has STATUS = SAILED, delete the EXPECTED row.
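and so on.
Rows must first match on qty, port, and vessel; rows are then deleted according to the rules.
For EXP in IMP/EXP, rows must match on the same columns: qty, port, and vessel.
Status priority for EXP:
priority: sailed > anchorage > expected > berth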
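The two priority orders above can be written down as a rank lookup. This is only a minimal sketch: `imp_order`, `exp_order`, and `status_rank` are hypothetical names introduced for illustration, and the sketch assumes STATUS and IMP/EXP values are upper-case strings as in the table.

```r
# Status orders taken from the question, lowest priority first.
imp_order <- c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED")
exp_order <- c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")

# status_rank is a hypothetical helper: a higher number means the row
# is more likely to be kept when it matches another row.
status_rank <- function(status, impexp) {
  ifelse(impexp == "IMP",
         match(status, imp_order),
         match(status, exp_order))
}

# BERTH ranks high for an import but lowest for an export.
status_rank(c("BERTH", "BERTH"), c("IMP", "EXP"))
```

Within a set of matching rows, the row with the highest rank would be the one to keep.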
The OUTPUT should be:
   PORT      STATUS  VESSEL          DWT   IMP/EXP  QTY (Mts)
1  KANDLA    SAILED  CAPTAIN HAMADA  7938  EXP      4500
3  KAKINADA  BERTH   CELON BREEZE          IMP      3000
4  KAKINADA  SAILED  CELON BREEZE          IMP      30000
The desired output has rows 2, 5, and 6 deleted.
Answer 0 (score: 1)
First, you need to read the data into R as a data.frame. The data.frame test should look like this:
> test
# PORT STATUS VESSEL DWT IMPEXP QTY
#1 KANDLA SAILED CAPTAIN HAMADA 7938 EXP 4500
#2 KAKINADA EXPECTED CELON BREEZE NA IMP 30000
#3 KAKINADA BERTH CELON BREEZE NA IMP 3000
#4 KAKINADA SAILED CELON BREEZE NA IMP 30000
#5 KANDLA ANCHORAGE CAPTAIN HAMADA NA EXP 4500
#6 KAKINADA BERTH CELON BREEZE NA IMP 30000
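One way to build this data.frame by hand is sketched below. The values are copied from the table in the question; the column name IMPEXP (instead of IMP/EXP) follows the printout above, and `stringsAsFactors = FALSE` is an assumption made so that STATUS stays a plain character column.

```r
# Rebuild the example data; DWT is NA wherever the question leaves it blank.
test <- data.frame(
  PORT   = c("KANDLA", "KAKINADA", "KAKINADA", "KAKINADA", "KANDLA", "KAKINADA"),
  STATUS = c("SAILED", "EXPECTED", "BERTH", "SAILED", "ANCHORAGE", "BERTH"),
  VESSEL = c("CAPTAIN HAMADA", "CELON BREEZE", "CELON BREEZE",
             "CELON BREEZE", "CAPTAIN HAMADA", "CELON BREEZE"),
  DWT    = c(7938, NA, NA, NA, NA, NA),
  IMPEXP = c("EXP", "IMP", "IMP", "IMP", "EXP", "IMP"),
  QTY    = c(4500, 30000, 3000, 30000, 4500, 30000),
  stringsAsFactors = FALSE
)

print(test)
```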
Using the ddply function from the plyr package, you should be able to get the desired output with the help of the following function.
library(plyr)

# Status priorities, lowest to highest, for imports and for exports
imp_levels <- c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED")
exp_levels <- c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")

# Group rows that match on port, vessel, IMP/EXP and quantity
ddply(test, .variables = c("PORT", "VESSEL", "IMPEXP", "QTY"),
      function(t) {
        # Pick the priority order that applies to this group
        lv <- if (t$IMPEXP[1] == "IMP") imp_levels else exp_levels
        # Keep only the row whose STATUS ranks highest in that order
        t[which.max(match(t$STATUS, lv)), ]
      })
# PORT STATUS VESSEL DWT IMPEXP QTY
#1 KAKINADA BERTH CELON BREEZE NA IMP 3000
#2 KAKINADA SAILED CELON BREEZE NA IMP 30000
#3 KANDLA SAILED CAPTAIN HAMADA 7938 EXP 4500
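If plyr is not available, the same selection can be sketched in base R with order() and duplicated(). This is an alternative to the ddply approach, not part of the original answer; the variable names (`imp_order`, `exp_order`, `rank`, `key`, `result`) are chosen here for illustration, and the data is rebuilt inline so the block runs on its own.

```r
# Example data copied from the question (DWT omitted values are NA).
test <- data.frame(
  PORT   = c("KANDLA", "KAKINADA", "KAKINADA", "KAKINADA", "KANDLA", "KAKINADA"),
  STATUS = c("SAILED", "EXPECTED", "BERTH", "SAILED", "ANCHORAGE", "BERTH"),
  VESSEL = c("CAPTAIN HAMADA", "CELON BREEZE", "CELON BREEZE",
             "CELON BREEZE", "CAPTAIN HAMADA", "CELON BREEZE"),
  DWT    = c(7938, NA, NA, NA, NA, NA),
  IMPEXP = c("EXP", "IMP", "IMP", "IMP", "EXP", "IMP"),
  QTY    = c(4500, 30000, 3000, 30000, 4500, 30000),
  stringsAsFactors = FALSE
)

# Priority orders, lowest first, as stated in the question.
imp_order <- c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED")
exp_order <- c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")

# Rank every row by the order that applies to its IMP/EXP value.
rank <- ifelse(test$IMPEXP == "IMP",
               match(test$STATUS, imp_order),
               match(test$STATUS, exp_order))

# Sort so the highest-ranked row of each group comes first, then drop
# every later row that repeats the same key columns.
key    <- c("PORT", "VESSEL", "IMPEXP", "QTY")
sorted <- test[order(-rank), ]
result <- sorted[!duplicated(sorted[key]), ]

# Restore the original row order so the row numbers read 1, 3, 4.
result <- result[order(as.integer(rownames(result))), ]
print(result)
```

Unlike ddply, this keeps the original row numbers, which makes it easy to check that rows 2, 5, and 6 were the ones removed.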