Question

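I used cdplot (R) to plot the conditional density of a variable. My independent variable and my dependent variable are not independent of each other. The independent variable is discrete (it takes only certain values between 0 and 3), and the dependent variable is also discrete (11 levels from 0 to 1, in steps of 0.1).

Some data: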

dat <- read.table( text="y           x
3.00     0.0
2.75     0.0
2.75     0.1
2.75     0.1
2.75     0.2
2.25     0.2
3        0.3
2        0.3
2.25     0.4
1.75     0.4
1.75     0.5
2        0.5
1.75     0.6
1.75     0.6
1.75     0.7
1        0.7
0.54     0.8
0        0.8
0.54     0.9
0        0.9
0        1.0
0        1.0", header=TRUE, colClasses="factor")
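
For reference, a minimal sketch of how the conditional distribution could be inspected with these data. It assumes both columns are converted to numbers first (which also merges the "3" and "3.00" labels that colClasses="factor" keeps apart) and that y is treated as the response; the names `num`, `tab`, and `cond` are illustrative only.

```r
# Same values as the table above, entered as numeric vectors
num <- data.frame(
  y = c(3, 2.75, 2.75, 2.75, 2.75, 2.25, 3, 2, 2.25, 1.75, 1.75,
        2, 1.75, 1.75, 1.75, 1, 0.54, 0, 0.54, 0, 0, 0),
  x = rep((0:10) / 10, each = 2)
)

# With two discrete variables, the conditional distribution of y given x
# is simply a table of row proportions
tab  <- table(x = num$x, y = num$y)
cond <- prop.table(tab, margin = 1)  # each row sums to 1
round(cond, 2)

# cdplot() expects a numeric explanatory variable and a factor response;
# it smooths over x with a kernel density estimate
cdplot(factor(y) ~ x, data = num)

# spineplot() with two factors shows the same proportions without smoothing
spineplot(factor(y) ~ factor(x), data = num)
```

With only two observations per x value, the unsmoothed table and spine plot show how little information each conditional distribution rests on, which is worth keeping in mind when reading the smoothed cdplot.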

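I would like to know whether my variables are suitable for this kind of analysis.

I would also like to know how to report this result in an elegant way that makes sense academically and statistically.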

Answer 1

Here is a run using the lrm function from the rms package, which is typically used for binary outcomes but also handles ordered categorical variables:

library(rms)  # also loads Hmisc
# First get the data into the form you described
dat[] <- lapply(dat, ordered)  # makes both columns ordered factor variables

?lrm  # read the help page; also look at the supporting book and citations on that page
lrm(y ~ x, data = dat)
# --- output ---
Logistic Regression Model

 lrm(formula = y ~ x, data = dat)


 Frequencies of Responses

    0 0.54    1 1.75    2 2.25 2.75    3 3.00 
    4    2    1    5    2    2    4    1    1 

                        Model Likelihood        Discrimination       Rank Discrim.    
                           Ratio Test              Indexes              Indexes       
 Obs             22    LR chi2      51.66    R2             0.920    C       0.869    
 max |deriv| 0.0004    d.f.            10    g             20.742    Dxy     0.738    
                       Pr(> chi2) <0.0001    gr    1019053402.761    gamma   0.916    
                                             gp             0.500    tau-a   0.658    
                                             Brier          0.048                     

         Coef     S.E.     Wald Z Pr(>|Z|)
 y>=0.54  41.6140 108.3624  0.38  0.7010  
 y>=1     31.9345  88.0084  0.36  0.7167  
 y>=1.75  23.5277  74.2031  0.32  0.7512  
 y>=2      6.3002   2.2886  2.75  0.0059  
 y>=2.25   4.6790   2.0494  2.28  0.0224  
 y>=2.75   3.2223   1.8577  1.73  0.0828  
 y>=3      0.5919   1.4855  0.40  0.6903  
 y>=3.00  -0.4283   1.5004 -0.29  0.7753  
 x       -19.0710  19.8718 -0.96  0.3372  
 x=0.2     0.7630   3.1058  0.25  0.8059  
 x=0.3     3.0129   5.2589  0.57  0.5667  
 x=0.4     1.9526   6.9051  0.28  0.7773  
 x=0.5     2.9703   8.8464  0.34  0.7370  
 x=0.6    -3.4705  53.5272 -0.06  0.9483  
 x=0.7   -10.1780  75.2585 -0.14  0.8924  
 x=0.8   -26.3573 109.3298 -0.24  0.8095  
 x=0.9   -24.4502 109.6118 -0.22  0.8235  
 x=1     -35.5679 488.7155 -0.07  0.9420
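
One way to turn the printed coefficients into something reportable: lrm fits a proportional-odds model in which each intercept `y>=k` is the log odds of P(y >= k) when all predictor terms are zero. The sketch below copies three intercepts from the output above and assumes that x = 0.0 is the baseline at which every x term contributes zero; the names `intercepts` and `p_baseline` are illustrative.

```r
# Intercepts copied from the lrm output above (log-odds scale)
intercepts <- c("y>=2" = 6.3002, "y>=2.25" = 4.6790, "y>=2.75" = 3.2223)

# plogis() is the inverse logit: log odds -> probability
p_baseline <- plogis(intercepts)
round(p_baseline, 3)
```

Given standard errors as large as those printed, probabilities derived this way are very imprecise and are probably best reported with that caveat.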

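There is also the MASS::polr function, but I find Harrell's version more approachable. This could also be approached with rank regression; the quantreg package is pretty standard if that is the route you choose. Looking at your other question, I wonder whether you have tried a logistic transformation as a way of linearizing the relationship. Of course, the use of lrm with an ordered variable illustrated here is a logistic transformation "under the hood".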
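
The logistic-transformation idea can be sketched in base R. This is only an illustration under assumptions not stated in the post: y is rescaled to the unit interval by dividing by its maximum of 3, and because `qlogis()` is infinite at exactly 0 and 1, the values are first squeezed slightly toward 0.5. The squeeze formula and the names `p`, `p_sq`, `z`, and `fit` are choices made for this sketch.

```r
# Data from the question, as numeric vectors
y <- c(3, 2.75, 2.75, 2.75, 2.75, 2.25, 3, 2, 2.25, 1.75, 1.75,
       2, 1.75, 1.75, 1.75, 1, 0.54, 0, 0.54, 0, 0, 0)
x <- rep((0:10) / 10, each = 2)

# Rescale y from [0, 3] to [0, 1]
p <- y / 3

# Squeeze away from exactly 0 and 1 (assumption of this sketch)
n    <- length(p)
p_sq <- (p * (n - 1) + 0.5) / n

# Logit transform, then an ordinary linear fit on the transformed scale
z   <- qlogis(p_sq)
fit <- lm(z ~ x)
coef(fit)

# Visual check of how linear the relationship looks after the transform
plot(x, z, xlab = "x", ylab = "logit of rescaled y")
abline(fit)
```

Whether the transformed relationship is close enough to linear is a judgment call to make from the plot; the ordinal model shown earlier avoids having to choose a squeeze at all.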

Conditional density distribution, two discrete variables
