How do I find the difference in levels between two vectors of ordered factors in R?

Time: 2015-07-18 02:43:17

Tags: r ordinal

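Suppose I have two ordered factors, start and end, that have the same length and use the same levels. How can I return a vector showing, for each element, how many levels it moved in going from start to end?

For example, suppose we have: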

start = 'C5',  NA,   'C3',   'C5',   'T1'
end =   'C5', 'C5',   NA  ,  'C6',   'C6'
Levels: C2 < C3 < C4 < C5 < C6 < C7 < C8 < T1 < T2 < T3 < T4 < T5 < T6 < T7 < T8 < T9 < T10 < T11 < T12 < S1 < S2 < S3 < S45

Ideally, I would like something as simple as end - start to give me c(0, NA, NA, 1, -3).

Here is the code to set up the example above:

lvls<-c("C2","C3","C4","C5","C6","C7","C8",
        "T1","T2","T3","T4","T5","T6","T7","T8","T9","T10","T11","T12",
        "S1","S2","S3","S45")
start<-c('C5',  NA,   'C3',   'C5',   'T1') 
start<-ordered(start,levels=lvls)

end<-c('C5', 'C5',   NA  , 'C6',   'C6')
end<-ordered(end,levels=lvls)

1 Answer:

Answer 0 (score: 0)

I found what I think is an unnecessarily complicated way to do this. I would happily accept a simpler, more elegant answer.

> start_lvls<-sapply(start,function(elem){match(TRUE,c(elem==levels(elem)))})
> end_lvls<-sapply(end,function(elem){match(TRUE,c(elem==levels(elem)))})
> end_lvls-start_lvls
[1]  0 NA NA  1 -3
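
A possibly simpler route, offered as a sketch rather than as part of the original answer: an ordered factor stores each element as an integer code that indexes into its levels, so as.integer() should recover the level position directly, with NA staying NA. This assumes both vectors were created with the same levels, as in the question. The name level_diff is only illustrative.

```r
# Setup copied from the question.
lvls <- c("C2", "C3", "C4", "C5", "C6", "C7", "C8",
          "T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8", "T9", "T10", "T11", "T12",
          "S1", "S2", "S3", "S45")
start <- ordered(c("C5", NA, "C3", "C5", "T1"), levels = lvls)
end   <- ordered(c("C5", "C5", NA, "C6", "C6"), levels = lvls)

# as.integer() gives the position of each element's level; NA stays NA.
# level_diff is an illustrative name, not something from the original post.
level_diff <- as.integer(end) - as.integer(start)
print(level_diff)
# [1]  0 NA NA  1 -3
```

Because the subtraction works on whole vectors, no sapply() call over individual elements is needed.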

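Since I have hundreds of ordered factors, and since I want to be able to analyse the difference between any two of them, I now use these functions to make my life easier: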

# Returns the numeric position (level index) of each element of the ordered factor orderedFactorVec
numericLevels<-function(orderedFactorVec){
  return(sapply(orderedFactorVec,function(elem){match(TRUE,c(elem==levels(elem)))}))
}

# Returns the numeric difference in level positions between two ordered factor vectors (i.e. end - start)
levelsChange<-function(start,end){
  return(numericLevels(end)-numericLevels(start))
}
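
When the same helper is applied across many pairs of factors, it may be worth guarding against pairs whose levels do not match, since differences of level positions are only meaningful on a shared scale. The following is a minimal sketch under that assumption; levelsChangeChecked is a hypothetical name, not part of the original answer, and it relies on as.integer() returning each element's level position.

```r
# Hypothetical variant of levelsChange(): vectorised, with input checks.
levelsChangeChecked <- function(start, end) {
  # Both inputs must be ordered factors on the same scale and of equal length.
  stopifnot(is.ordered(start), is.ordered(end))
  stopifnot(identical(levels(start), levels(end)))
  stopifnot(length(start) == length(end))
  # as.integer() yields the level position of each element; NA stays NA.
  as.integer(end) - as.integer(start)
}

# Small illustrative data, not taken from the question.
lvls <- c("C2", "C3", "C4", "C5")
a <- ordered(c("C2", "C5", NA), levels = lvls)
b <- ordered(c("C4", "C3", "C5"), levels = lvls)
result <- levelsChangeChecked(a, b)
print(result)
# [1]  2 -2 NA
```

Calling it with two factors whose levels differ stops with an error instead of silently returning numbers that compare unrelated scales.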