I have the following data frame in R:
Client_ID  IT    FMCG  Consumer  Oil_Gas  Finance
ABC        0     2345  0         4768.90  0
CFG        234   0     0         2366.54  0
DEF        1234  0     345       523      2344
Now I want to list the sectors each client holds (that is, the columns with a non-zero value). I can do this with the following R code:
df$portfolio_holdings <- simplify2array(apply(df[2:6], 1, function(x) paste(names(df[2:6])[x != 0], collapse = " ")))
This gives me the following output:
Client_ID  IT    FMCG  Consumer  Oil_Gas  Finance  portfolio_holdings
ABC        0     2345  0         4768.90  0        FMCG Oil_Gas
CFG        234   0     0         2366.54  0        IT Oil_Gas
DEF        1234  0     345       523      2344     IT Consumer Oil_Gas Finance
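The step above can be reproduced as a self-contained snippet. This is only a sketch: the data are retyped from the table in the question, `sector_cols` is a name introduced here for readability, and the `n_sectors` column is a hypothetical extra in case a count of sectors is wanted alongside the names.

```r
# Data retyped from the question's table (three clients, five sector columns)
df <- data.frame(
  Client_ID = c("ABC", "CFG", "DEF"),
  IT        = c(0, 234, 1234),
  FMCG      = c(2345, 0, 0),
  Consumer  = c(0, 0, 345),
  Oil_Gas   = c(4768.90, 2366.54, 523),
  Finance   = c(0, 0, 2344),
  stringsAsFactors = FALSE
)

# Assumed helper name: the columns that hold sector amounts
sector_cols <- c("IT", "FMCG", "Consumer", "Oil_Gas", "Finance")

# For each row, keep the names of the sector columns with a non-zero value
df$portfolio_holdings <- apply(
  df[sector_cols], 1,
  function(x) paste(sector_cols[x != 0], collapse = " ")
)

# Hypothetical extra column: how many sectors each client holds
df$n_sectors <- rowSums(df[sector_cols] != 0)

print(df[c("Client_ID", "portfolio_holdings", "n_sectors")])
```

Selecting the columns by name rather than by position (`df[2:6]`) keeps the code working if the column order changes.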
I have another data frame with the following columns:
Sectors   Scrip  Target_Price  Call
FMCG      WER    345           Buy
IT        CFHG   134           Sell
Oil_Gas   ERTY   567           Buy
Consumer  QWER   543           Buy
Finance   QASD   334           Buy
Now I want to recommend to each client up to 3 random sectors that are not already held in that client's portfolio. The final desired data frame would be:
Client_ID  IT    FMCG  Consumer  Oil_Gas  Finance  portfolio                    Recommendation
ABC        0     2345  0         4768.90  0        FMCG Oil_Gas                 1:IT|CFHG|134|Sell||2:Consumer|QWER|543|Buy||3:Finance|QASD|334|Buy
CFG        234   0     0         2366.54  0        IT Oil_Gas                   1:FMCG|WER|345|Buy||2:Consumer|QWER|543|Buy||3:Finance|QASD|334|Buy
DEF        1234  0     345       523      2344     IT Consumer Oil_Gas Finance  1:FMCG|WER|345|Buy
How can I achieve this in R?
Sample data frames:
client_id <- c('ABC','DEF','ERT')
IT <- c(0,234,1234)
FMCG <- c(2345,0,0)
Consumer <- c(0,0,345)
Oil_Gas <- c(4768,2366,523)
Finance <- c(0,0,2345)
Sectors <- c('FMCG','IT','Oil_Gas','Consumer','Finance')
Scrip <- c('ABC','DFG','ERT','QWE','VGB')
Target <- c(345,134,567,543,334)
call <- c('Buy','Sell','Buy','Buy','Buy')
recom <- data.frame(Sectors,Scrip,Target,call)
df <- data.frame(client_id,IT,FMCG,Consumer,Oil_Gas,Finance)
Answer 0 (score: 2)
I have not tested this code because I don't have the data frames here, but you should get the idea.
Assuming the second data frame is named df2:
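recomend <- function(df1, df2) {
  df1$Recommendation <- NA_character_
  for (i in seq_len(nrow(df1))) {
    # Sectors the client already holds (space-separated in portfolio_holdings)
    held <- strsplit(as.character(df1$portfolio_holdings[i]), " ", fixed = TRUE)[[1]]
    # Rows of df2 whose sector is not yet in the portfolio
    recm <- which(!df2$Sectors %in% held)
    if (length(recm) == 0) next  # nothing left to recommend; keep NA
    # Pick at most 3 of them at random (no NA padding when fewer than 3 remain)
    recm <- recm[sample.int(length(recm), min(3, length(recm)))]
    # Join the columns of each picked row with "|"
    nval <- do.call(paste, c(unname(as.list(df2[recm, , drop = FALSE])), sep = "|"))
    # Number the entries ("1:", "2:", ...) and join them with "||"
    df1$Recommendation[i] <- paste(paste0(seq_along(nval), ":", nval), collapse = "||")
  }
  df1
}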
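For comparison, here is a self-contained sketch of the same idea that works directly on the sector columns of the question's sample data instead of parsing `portfolio_holdings`. The helper `recommend_for`, its `n` argument, and `sector_cols` are names invented for this illustration and are not part of the answer's code; `set.seed()` is there only so the random picks can be repeated.

```r
# Sample data from the question
df <- data.frame(
  client_id = c("ABC", "DEF", "ERT"),
  IT        = c(0, 234, 1234),
  FMCG      = c(2345, 0, 0),
  Consumer  = c(0, 0, 345),
  Oil_Gas   = c(4768, 2366, 523),
  Finance   = c(0, 0, 2345),
  stringsAsFactors = FALSE
)
recom <- data.frame(
  Sectors = c("FMCG", "IT", "Oil_Gas", "Consumer", "Finance"),
  Scrip   = c("ABC", "DFG", "ERT", "QWE", "VGB"),
  Target  = c(345, 134, 567, 543, 334),
  call    = c("Buy", "Sell", "Buy", "Buy", "Buy"),
  stringsAsFactors = FALSE
)

# Assumed helper name: the columns that hold sector amounts
sector_cols <- c("IT", "FMCG", "Consumer", "Oil_Gas", "Finance")

# Hypothetical helper: up to n random recommendations for one client.
# `holdings` is a named numeric vector (one row of the sector columns).
recommend_for <- function(holdings, recom, n = 3) {
  held <- names(holdings)[holdings != 0]
  idx <- which(!recom$Sectors %in% held)
  if (length(idx) == 0) return(NA_character_)
  # Random subset of at most n rows
  idx <- idx[sample.int(length(idx), min(n, length(idx)))]
  rows <- paste(recom$Sectors[idx], recom$Scrip[idx],
                recom$Target[idx], recom$call[idx], sep = "|")
  paste(paste0(seq_along(rows), ":", rows), collapse = "||")
}

set.seed(1)  # only to make the random picks reproducible
df$Recommendation <- apply(df[sector_cols], 1, recommend_for, recom = recom)
print(df[c("client_id", "Recommendation")])
```

In this sample, client ERT already holds every sector except FMCG, so that row receives a single recommendation, while ABC and DEF each receive three in random order.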