Question

I am new to R. My data comes in the following form, with two columns:

Broker_ID_Buy            Broker_ID_Sell

  638                       423
  546                       728
  423                       321
  546                       423

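It goes on like this, with about 28 different brokers appearing in the buy and sell positions at different times.

I need to arrange the data like this: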

Broker_ID_638           Broker_ID_423         Broker_ID_546

BP                         SP                   IP
IP                         IP                   BP
IP                         BP                   IP
IP                         SP                   BP

BP = buy position, SP = sell position, IP = idle position

I want to use these three states to make predictions with a Markov chain.

Answer 1

This seems to get you in the right ballpark:

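library(reshape2)
x <- data.frame(BP = c(638,546,423,546), SP = c(423, 728, 321, 423))
x$index <- 1:nrow(x)  # row number identifies each trade
# long format: one row per (index, position, broker)
x.m <- melt(x, id.vars = "index")
# wide format: one column per broker, cells hold "BP" or "SP"
out <- dcast(index ~ value, data = x.m, value.var="variable")
out[is.na(out)] <- "IP"  # brokers absent from a row are idle
out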
    #---
  index 321 423 546 638 728
1     1  IP  SP  IP  BP  IP
2     2  IP  IP  BP  IP  SP
3     3  SP  BP  IP  IP  IP
4     4  IP  SP  BP  IP  IP
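
If reshape2 is not installed, roughly the same table can be sketched in base R by starting from a matrix filled with "IP" and overwriting the buy and sell cells. This is an alternative to the dcast call above, not what the answer itself used; the names brokers and rows are made up for illustration, and the Broker_ID_ prefix is added only to match the layout asked for in the question.

```r
# Same toy data as in the answer above
x <- data.frame(BP = c(638, 546, 423, 546), SP = c(423, 728, 321, 423))

# All broker IDs that occur on either side (hypothetical helper name)
brokers <- sort(unique(c(x$BP, x$SP)))

# Start with every broker idle in every row
out <- matrix("IP", nrow = nrow(x), ncol = length(brokers),
              dimnames = list(NULL, paste0("Broker_ID_", brokers)))

# Overwrite the (row, broker) cells for buyers and sellers
rows <- seq_len(nrow(x))
out[cbind(rows, match(x$BP, brokers))] <- "BP"
out[cbind(rows, match(x$SP, brokers))] <- "SP"

out <- as.data.frame(out, stringsAsFactors = FALSE)
out
```

This assumes a broker never appears as both buyer and seller in the same row; if that can happen, the "SP" assignment wins because it runs last.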

Answer 2

Here is another possible solution:

# create your table
txt <-
"Broker_ID_Buy,Broker_ID_Sell
638,423
546,728
423,321
546,423"
dt1 <- read.csv(text=txt)

# turn "Time, Broker_ID_Buy, Broker_ID_Sell" data.frame 
# into "Time, Broker_ID, Position"
buyers  <- data.frame(Time=1:nrow(dt1),
                      Broker_ID=dt1$Broker_ID_Buy,
                      Position="BP",
                      stringsAsFactors=FALSE)
sellers <- data.frame(Time=1:nrow(dt1),
                      Broker_ID=dt1$Broker_ID_Sell,
                      Position="SP",
                      stringsAsFactors=FALSE)
longDT <- rbind(buyers,sellers)

# pivot the broker ids into columns
wideDT <- reshape(data=longDT,direction="wide",
                  timevar="Broker_ID", idvar="Time", v.names="Position")

# tidy up the column names and turn NAs into "IP"
names(wideDT) <- sub("^Position\\.", "Broker_ID_", names(wideDT))
wideDT[is.na(wideDT)] <- "IP"

Result:

> wideDT
  Time Broker_ID_638 Broker_ID_546 Broker_ID_423 Broker_ID_728 Broker_ID_321
1    1            BP            IP            SP            IP            IP
2    2            IP            BP            IP            SP            IP
3    3            IP            IP            BP            IP            SP
4    4            IP            BP            SP            IP            IP

Grouping values into separate columns in R

2 answers: