My TraMineR sequence plots extend beyond the range of my data. Why does this happen, and how can I fix it?

Asked: 2019-06-19 13:08:56

Tags: r traminer

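My problem is easily summarized by the following two pictures:

Sequence plot: [image]

Sequence frequencies: [image]

Although the dataset ends in 2018, the plots extend beyond that point on the x-axis. I would really like to find out why.

I tried shortening the time span by one year so that it ends in 2017. Nothing changed. I am a bit of a noob, so my ideas are limited. My guess is that it might have something to do with the "NA" category. The overall plot looks normal, though; only the cluster plots and the sequence frequency plot extend beyond the x-axis.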

MyData <- read.csv2(file="e:/Dokumente (-videoedluxe)/IBP dokumente/Msc/seq/mappe1.csv", sep=";", skipNul=TRUE, stringsAsFactors=FALSE, na = "empty")

str(MyData, sep=",")

install.packages("TraMineR")
library("TraMineR")

table(MyData$x1989)

data.seq <- seqdef(MyData, var = 2:31, ext = TRUE, gaps="NA",
                  alphabet=c("GOVspec","GOVinv","GOVno","IOspec","IOinv","IOno","ICspec","ICinv","ICno","MNCspec","MNCinv","MNCno","NGOspec","NGOspec","NGOinv","NGOno","NPOspec","NPOinv","NPOno","UNIspec","UNIinv","UNIno","EDUspec","EDUinv","EDUno", NA),
                  states = c("GOVspec","GOVinv","GOVno","IOspec","IOinv","IOno","ICspec","ICinv","ICno","MNCspec","MNCinv","MNCno","NGOspec","NGOspec","NGOinv","NGOno","NPOspec","NPOinv","NPOno","UNIspec","UNIinv","UNIno","EDUspec","EDUinv","EDUno", NA))
cpal(data.seq) <-c("aquamarine2","aquamarine3","aquamarine4","chocolate2","chocolate3","chocolate4","cadetblue2","cadetblue3","cadetblue4","gold1","gold3","gold4","green2","green3","green4","hotpink2","hotpink3","hotpink4","orange2","orange3","orange4","purple2","purple4","rosybrown","orchid2", "white")
seqstatl(MyData)
summary(data.seq)

years = c(1989:2018)

par(mfrow = c(1, 2))
seqdplot(data.seq, with.legend = FALSE, border = NA, x = years)
seqlegend(data.seq)

cost.constant <- seqsubm(data.seq, method="CONSTANT", time.varying= T, with.miss = FALSE)
cost.trate <- seqsubm(data.seq, method="TRATE", time.varying= T, with.miss = FALSE)

seqfplot(data.seq, withlegend="FALSE")
seqmtplot(data.seq, withlegend="RIGHT", title="Mean Time")

analysis.manual <- seqdist(data.seq, method="OM", sm="TRATE", indel=1.5)
library(cluster)
analysis.manual = agnes(analysis.manual)
clusterward <- agnes(data.seq, method="ward")
plot(clusterward, which.plots = 2)

plot(analysis.trate, which.plots = 8)

## CLUSTER ANALYSIS

cluster1 = cutree(analysis.trate, 1)
cluster2 = cutree(analysis.trate, 2)
cluster3 = cutree(analysis.trate, 3)
cluster4 = cutree(analysis.trate, 4)

# Distribution plot
seqdplot(data.seq, group= cluster1, withlegend = F, border = NA, x = years)

# Index plots
seqIplot(data.seq, group= cluster1, withlegend = F, border = NA, x = years)
seqIplot(data.seq, group= cluster2, withlegend = F, border = NA, x = years)
seqIplot(data.seq, group= cluster3, withlegend = F, border = NA, x = years)
seqIplot(data.seq, group= cluster4, withlegend = F, border = NA, x = years)

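I would like the plots to end where they should.

1 answer:

Answer 0: (score: 0)

We cannot reproduce the problem because we do not have the data. However, I notice that your alphabet contains the state NGOspec twice. Removing one of the two and adjusting the related arguments (such as states) accordingly should solve the problem.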
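
The duplicated state can be confirmed with base R before calling seqdef. The following is only a minimal sketch: the labels are copied from the alphabet in the question (without the trailing NA entry), and the variable names alphabet, dups and alphabet_clean are introduced here purely for illustration.

```r
# State labels copied from the question's alphabet (NA entry left out)
alphabet <- c("GOVspec", "GOVinv", "GOVno", "IOspec", "IOinv", "IOno",
              "ICspec", "ICinv", "ICno", "MNCspec", "MNCinv", "MNCno",
              "NGOspec", "NGOspec", "NGOinv", "NGOno",
              "NPOspec", "NPOinv", "NPOno", "UNIspec", "UNIinv", "UNIno",
              "EDUspec", "EDUinv", "EDUno")

# duplicated() flags the second and later occurrences of a value
dups <- unique(alphabet[duplicated(alphabet)])
print(dups)

# unique() keeps the first occurrence of each label, preserving order
alphabet_clean <- unique(alphabet)
print(length(alphabet))        # number of labels as written in the question
print(length(alphabet_clean))  # number of distinct labels
```

With the labels as posted, this reports NGOspec as the only duplicate and leaves 24 distinct labels out of 25. Any vector that is matched to the alphabet position by position, such as states or the color vector assigned with cpal, would presumably need one entry fewer as well.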