seqecreate reorders events alphabetically

Time: 2019-03-29 15:47:43

Tags: r traminer

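I am creating event sequences with the TraMineR seqecreate function. However, in the resulting event sequence object, events that occur at the same time are reordered alphabetically.

The data are sorted in the order in which the events occurred, but when the event sequence object is created, simultaneous events are reordered alphabetically.

I could combine the simultaneous events manually, but I would like to know how to make sure seqecreate does not reorder the events.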

library("dplyr")
library("TraMineR")
# DATA
eventDat <- data.frame(id = c(rep(1,4), rep(2,10), rep(3,12)),
                        timeframe = c(rep(0,3),1,rep(0,3),rep(458,3),rep(558,2),559,560,
                                      rep(0,3),8,rep(48,3),57,169,170,511,546),
                        event = c("I01,I02,I03,I17,I05,I16","T222,T511,T30,T12","noProc",
                                  "apcdischarge","I01","T222,T221,T53","aedischarge",
                                  "I03,I05,I06","T222,T511,T30,T17","aedischarge",
                                  "I01,I02,I03,I05,I16,I14,I17,I07,I06",
                                  "T222,T516,T291,T30","M472","apcdischarge",
                                  "I01,I02,I05,I03","T12,T25,T30,T222,T291",
                                  "noProc","apcdischarge","I01,I02,I05,I03,I17",
                                  "T222,T221,T511,T30","noProc","apcdischarge",
                                  "noProc","apcdischarge","E852,E851,U201","apcdischarge"
                        ))


seqDat<- seqecreate(id=eventDat$id, 
                  timestamp=eventDat$timeframe,
                  event=eventDat$event)

seqDat[1]

#Warning message:
# In seqecreate.internal(data = data, id = id, timestamp = timestamp,  :
# [!] some of your events contain '(', ')' or ',' characters. 
# The search of specific subsequences may not work properly.

# Replace commas with dots so event names no longer contain ',' characters
eventDat <- eventDat %>%
  rowwise()%>%
  mutate(eventF = paste0(trimws(strsplit(as.character(event), ",")[[1]], "b"), 
                        collapse = "."))

#order by ID and time frame
eventDat <- eventDat %>%
  arrange(id, timeframe)

seqDat<- seqecreate(id=eventDat$id, 
                    timestamp=eventDat$timeframe,
                    event=eventDat$eventF)
seqDat[1]

The output I get is

(I01.I02.I03.I17.I05.I16,noProc,T222.T511.T30.T12)-1-(apcdischarge)

but I expected

(I01.I02.I03.I17.I05.I16,T222.T511.T30.T12,noProc)-1-(apcdischarge)

1 answer:

Answer 0: (score: 1)

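When the event argument of seqecreate is not a factor, it is coerced to a factor, and by default the levels are sorted alphabetically. You can set your own order by passing a custom factor with the desired level order as the event argument.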
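The coercion behaviour can be checked with base R alone. The following is a small sketch using a made-up vector `ev` (not the question's data). It shows that `factor()` sorts the levels by default, and that an explicit `levels` argument keeps whatever order is supplied. Using `unique(ev)` to get the order of first appearance is only one possible choice and is an assumption here, not part of the original answer.

```r
# Hypothetical event labels, listed in the order they occur in the data
ev <- c("T222.T511.T30.T12", "noProc", "I01.I02")

# Default: levels are sorted, regardless of the order in the data
default_levels <- levels(factor(ev))
print(default_levels)

# Explicit levels: the supplied order is kept (here: order of first appearance)
custom_levels <- levels(factor(ev, levels = unique(ev)))
print(custom_levels)
```

Note that the order of first appearance is defined over the whole data set, so it may not match the desired order inside every individual sequence.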

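I illustrate using your example.

We retrieve the current order of the levels and move the "noProc" event to the last position:

ev.list <- levels(factor(eventDat$eventF))
ev.alph <- c(ev.list[ev.list!="noProc"],"noProc")

Now we define the factor using the desired order of the event alphabet and pass it as the event argument.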
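The final step, building the factor with the custom level order and handing it to seqecreate, could be sketched as follows. This is a reconstruction under assumptions rather than the answer's own code. The toy `eventDat` below only mirrors id 1 from the question, with commas already replaced by dots, and the seqecreate call is guarded so the sketch still runs when TraMineR is not installed.

```r
# Toy data mirroring id 1 from the question (an assumption for illustration)
eventDat <- data.frame(
  id        = c(1, 1, 1, 1),
  timeframe = c(0, 0, 0, 1),
  eventF    = c("I01.I02.I03.I17.I05.I16", "T222.T511.T30.T12",
                "noProc", "apcdischarge"),
  stringsAsFactors = FALSE
)

# Current (sorted) level order, then move "noProc" to the end
ev.list <- levels(factor(eventDat$eventF))
ev.alph <- c(ev.list[ev.list != "noProc"], "noProc")

# Define the factor with the desired level order
eventDat$eventF <- factor(eventDat$eventF, levels = ev.alph)

# Pass the factor as the event argument (only if TraMineR is available)
if (requireNamespace("TraMineR", quietly = TRUE)) {
  library("TraMineR")
  seqDat <- seqecreate(id = eventDat$id,
                       timestamp = eventDat$timeframe,
                       event = eventDat$eventF)
  print(seqDat[1])
}
```

With "noProc" as the last level, simultaneous events should be listed with "noProc" after the others, which is the ordering the question asks for.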