Contingency table with ordered data

Asked: 2018-08-13 04:45:41

Tags: r contingency

I have a dataset with two categorical variables, one of which (Extent) needs to be ordered:

Extent          LOP
Partially       Other
Partially       Other
Not at all      Other
Not at all      Other
Substantially   Other
Substantially   Other
Substantially   LOPA
Fully           LOPA
Fully           Other

I would like to convert it into a contingency table of frequencies, with the levels of Extent as columns, like this:

       Extent
  LOP  Not at all Partially Substantially Fully   
  LOPA      0          0         2          1
  Other     2          2         2          1

If I use a with() call:

with(data, table(LOP, Extent))

the levels of Extent do not come out in the order I want, because table() defaults to alphabetical order:

      Extent
  LOP Fully Not at all Partially Substantially    
  LOPA   1       0          0         2          
  Other  1       2          2         2          

I know I could build this table by hand, but my dataset is too large for that. Is there a more efficient way to do this while preserving the order?

1 Answer:

Answer 0: (score: 1)

I'm not sure how you got the result shown in your post; both base R's table and tidyr::spread preserve the level order of an ordered variable.

Here is a tidyverse option that preserves the order of the ordered variable.

library(tidyverse)
df %>%
    count(Extent, LOP) %>%
    spread(Extent, n, fill = 0)
## A tibble: 2 x 5
#  LOP   `Not at all` Partially Substantially Fully
#  <fct>        <dbl>     <dbl>         <dbl> <dbl>
#1 LOPA            0.        0.            1.    1.
#2 Other           2.        2.            2.    1.
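In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same reshaping with the newer function. It assumes a tidyr version recent enough to support the names_sort argument, and it rebuilds the sample data under the hypothetical name dat so that it runs on its own.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-alone copy of the sample data (the name `dat` is an assumption).
dat <- data.frame(
  Extent = c("Partially", "Partially", "Not at all", "Not at all",
             "Substantially", "Substantially", "Substantially",
             "Fully", "Fully"),
  LOP = c("Other", "Other", "Other", "Other", "Other", "Other",
          "LOPA", "LOPA", "Other")
)

# Declare the intended level order explicitly.
dat$Extent <- ordered(dat$Extent,
                      levels = c("Not at all", "Partially", "Substantially", "Fully"))

# Count each (Extent, LOP) pair, then spread Extent across the columns.
# names_sort = TRUE asks pivot_wider() to order the new columns by the
# sorted values of Extent, which for a factor means its level order.
wide <- dat %>%
  count(Extent, LOP) %>%
  pivot_wider(names_from = Extent, values_from = n,
              values_fill = 0, names_sort = TRUE)

names(wide)
```

Without names_sort, pivot_wider() orders the new columns by first appearance in the data, so the result then depends on how the rows happen to be arranged after count().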

In fact, table also preserves the order:

with(df, table(LOP, Extent))
#       Extent
#LOP     Not at all Partially Substantially Fully
#  LOPA           0         0             1     1
#  Other          2         2             2     1

So I'm not sure how you got the result in your post.
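One possible explanation, offered only as a guess, is that Extent was still a character vector or a factor with default levels when table() was called. The base R sketch below illustrates that difference. The data frame name dat and the vector extent_levels are hypothetical names introduced for this illustration.

```r
# Hypothetical copy of the sample data with Extent as plain character.
dat <- data.frame(
  Extent = c("Partially", "Partially", "Not at all", "Not at all",
             "Substantially", "Substantially", "Substantially",
             "Fully", "Fully"),
  LOP = c("Other", "Other", "Other", "Other", "Other", "Other",
          "LOPA", "LOPA", "Other"),
  stringsAsFactors = FALSE
)

# With no explicit levels, table() converts Extent to a factor whose
# levels are the sorted unique values, i.e. alphabetical order.
tab_default <- with(dat, table(LOP, Extent))
colnames(tab_default)

# Declaring the level order first makes table() follow that order.
extent_levels <- c("Not at all", "Partially", "Substantially", "Fully")
dat$Extent <- factor(dat$Extent, levels = extent_levels, ordered = TRUE)
tab_ordered <- with(dat, table(LOP, Extent))
colnames(tab_ordered)
```

If the second call gives the desired column order on the real data, then setting the levels before tabulating is probably all that is needed.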


Sample data

df <- read.table(text =
    "Extent          LOP
'Partially'       Other
'Partially'       Other
'Not at all'      Other
'Not at all'      Other
'Substantially'   Other
'Substantially'   Other
'Substantially'   LOPA
'Fully'           LOPA
'Fully'           Other", header = TRUE)

df$Extent <- ordered(df$Extent, levels = c("Not at all", "Partially", "Substantially", "Fully"))