I'm new to R. My data comes in the following form, as two columns:
Broker_ID_Buy Broker_ID_Sell
638 423
546 728
423 321
546 423
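This continues, with about 28 different brokers appearing on the buy and sell sides at different times.
I need to arrange the data like this: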
Broker_ID_638 Broker_ID_423 Broker_ID_546
BP SP IP
IP IP BP
IP BP IP
IP SP BP
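BP = buy position, SP = sell position, IP = idle position.
I want to use these three states to make predictions with a Markov chain.
Answer 0 (score: 2)
This seems to get you into the right ballpark: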
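library(reshape2)
# Example data: one row per trade, buyer in BP and seller in SP
x <- data.frame(BP = c(638, 546, 423, 546), SP = c(423, 728, 321, 423))
x$index <- seq_len(nrow(x))
# Long format: one row per (index, side, broker)
x.m <- melt(x, id.vars = "index")
# Wide format: one column per broker, each cell holds "BP" or "SP"
out <- dcast(x.m, index ~ value, value.var = "variable")
# Brokers that are absent from a trade are idle
out[is.na(out)] <- "IP"
out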
#---
  index 321 423 546 638 728
1     1  IP  SP  IP  BP  IP
2     2  IP  IP  BP  IP  SP
3     3  SP  BP  IP  IP  IP
4     4  IP  SP  BP  IP  IP
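The output above uses bare broker IDs as column names, while the question asked for names of the form Broker_ID_<id>. As a rough base R sketch of the same pivot (no reshape2), one could fill a character matrix directly. The names trades, brokers, states and states_df are illustrative and not from the original answer, and the sketch assumes a broker is never both buyer and seller in the same row (if it were, "SP" would overwrite "BP").

```r
# Same example data as in the question
trades <- data.frame(Broker_ID_Buy  = c(638, 546, 423, 546),
                     Broker_ID_Sell = c(423, 728, 321, 423))

# One column per broker that appears on either side
brokers <- sort(unique(c(trades$Broker_ID_Buy, trades$Broker_ID_Sell)))

# Start with every broker idle ("IP") at every time step
states <- matrix("IP", nrow = nrow(trades), ncol = length(brokers),
                 dimnames = list(NULL, paste0("Broker_ID_", brokers)))

# Matrix indexing with (row, column) pairs marks buyers and sellers
rows <- seq_len(nrow(trades))
states[cbind(rows, match(trades$Broker_ID_Buy,  brokers))] <- "BP"
states[cbind(rows, match(trades$Broker_ID_Sell, brokers))] <- "SP"

states_df <- data.frame(Time = rows, states, stringsAsFactors = FALSE)
states_df
```

Because every cell starts as "IP", no NA replacement step is needed afterwards.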
Answer 1 (score: 1)
Here is another possible solution:
# create your table
txt <-
"Broker_ID_Buy,Broker_ID_Sell
638,423
546,728
423,321
546,423"
dt1 <- read.csv(text=txt)
# turn "Time, Broker_ID_Buy, Broker_ID_Sell" data.frame
# into "Time, Broker_ID, Position"
buyers <- data.frame(Time = seq_len(nrow(dt1)),
                     Broker_ID = dt1$Broker_ID_Buy,
                     Position = "BP",
                     stringsAsFactors = FALSE)
sellers <- data.frame(Time = seq_len(nrow(dt1)),
                      Broker_ID = dt1$Broker_ID_Sell,
                      Position = "SP",
                      stringsAsFactors = FALSE)
longDT <- rbind(buyers, sellers)
# pivot the broker ids onto the columns
wideDT <- reshape(data = longDT, direction = "wide",
                  timevar = "Broker_ID", idvar = "Time", v.names = "Position")
# tidy the column names (literal match, not a regex) and turn NAs into "IP"
names(wideDT) <- sub("Position.", "Broker_ID_", names(wideDT), fixed = TRUE)
wideDT[is.na(wideDT)] <- "IP"
Result:
> wideDT
  Time Broker_ID_638 Broker_ID_546 Broker_ID_423 Broker_ID_728 Broker_ID_321
1    1            BP            IP            SP            IP            IP
2    2            IP            BP            IP            SP            IP
3    3            IP            IP            BP            IP            SP
4    4            IP            BP            SP            IP            IP
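The question's stated goal is to feed these three states into a Markov chain. Neither answer covers that step, so the following is only a minimal sketch of one common approach: counting observed state-to-state transitions for a single broker and normalising each row into probabilities. It uses the four states of broker 423 from the table above. The variable names are illustrative, and four observations are far too few for a meaningful estimate.

```r
# State sequence of broker 423 over the four example rows
states <- c("SP", "IP", "BP", "SP")
levels3 <- c("BP", "SP", "IP")

# Pair each state with its successor and count the transitions
from <- factor(states[-length(states)], levels = levels3)
to   <- factor(states[-1],              levels = levels3)
counts <- table(from, to)

# Row-normalise the counts into estimated transition probabilities;
# a state never observed as a starting point would give a row of NaN
trans <- prop.table(counts, margin = 1)
trans
```

With real data, the same counting could be repeated per broker column, or pooled across brokers if they are assumed to share one transition matrix.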