Continuation ratio model for a dataset with categorical and numeric features

Time: 2019-07-09 09:17:41

Tags: r regression glm categorical-data model-fitting

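I am trying to fit a continuation ratio model to my dataset, which has both numeric and categorical features. My dependent variable is numeric and represents length of stay in hospital. I also converted the target variable to an ordered variable, as suggested on RDocumentation.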
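The conversion to an ordered variable can be sketched in base R as follows. The LOS values are the first ten shown in the str() output in this question; the bin edges and labels are assumptions made for illustration only, not the ones actually used.

```r
# First ten LOS values (days) as shown in the str() output of this question.
los <- c(1, 3, 1, 1, 1, 3, 22, 3, 1, 1)

# Assumed bin edges: 1 day, 2-3 days, 4-7 days, more than 7 days.
# cut() uses right-closed intervals by default: (0,1], (1,3], (3,7], (7,Inf].
los_cat <- cut(
  los,
  breaks = c(0, 1, 3, 7, Inf),
  labels = c("1", "2-3", "4-7", "8+"),
  ordered_result = TRUE
)

# Check the class and the number of stays per category.
print(class(los_cat))
print(table(los_cat))
```

Empty categories (here "4-7") remain as levels of the ordered factor, which matters because some fitting functions fail or warn when a response level has no observations.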

I used the following functions: glmnetcr(), glmpathcr(), vglm() (with the family argument set to cratio), and train() (with method = "vglmContRatio").
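For reference, a vglm() call of the kind described might look like the sketch below. It assumes the VGAM package is installed and uses a small simulated data frame; the column names adm_type, age and los_cat are hypothetical stand-ins, not the real columns. Factors are passed through the formula, so no manual conversion to numeric is done here.

```r
# Simulated stand-in data; the real data set is not available here.
set.seed(1)
n <- 300
toy <- data.frame(
  adm_type = factor(sample(c("Day case", "Elective", "Emergency"), n, replace = TRUE)),
  age      = round(runif(n, 20, 90))
)
toy$los_cat <- factor(
  sample(c("short", "medium", "long"), n, replace = TRUE),
  levels = c("short", "medium", "long"),
  ordered = TRUE
)

# Fit only if VGAM is available (an assumption about the local setup).
if (requireNamespace("VGAM", quietly = TRUE)) {
  library(VGAM)
  # parallel = TRUE requests one slope per predictor shared across cutpoints;
  # the intercepts still differ by cutpoint.
  fit <- vglm(los_cat ~ adm_type + age,
              family = cratio(parallel = TRUE),
              data = toy)
  print(coef(fit))
}
```

With factors that have several hundred levels, as in the data described here, a formula like this produces one coefficient per non-reference level, so the fit may well fail or be unstable for reasons unrelated to the variable types as such.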

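None of them works with the variable types in my dataset. These methods only work when I convert all variables to numeric form, but the documentation does not state that the covariate matrix must contain only numeric features.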
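As far as I can tell, glmnetcr() and glmpathcr() take a predictor matrix x rather than a formula, and an R matrix cannot hold factors. A common way to keep categorical information in that setting is to expand each factor into dummy (indicator) columns with model.matrix() from base R, instead of coercing the factor codes to numbers. A minimal sketch, with hypothetical column names and values:

```r
# Small hand-made stand-in for a few of the columns in the question.
toy <- data.frame(
  admission = factor(c("Emergency", "Emergency", "Elective", "Day case",
                       "Emergency", "Elective", "Day case", "Emergency")),
  sex       = factor(c("Female", "Female", "Female", "Male",
                       "Female", "Female", "Male", "Female")),
  age       = c(58, 77, 82, 56, 49, 49, 79, 55)
)

# model.matrix() creates one indicator column per non-reference factor level.
# The first column is the intercept, which is dropped here because penalized
# fitting functions usually add their own.
x <- model.matrix(~ admission + sex + age, data = toy)[, -1, drop = FALSE]

print(colnames(x))
print(dim(x))
```

The resulting x is an ordinary numeric matrix, so it can be passed as the predictor matrix. Note that coercing a factor directly with as.numeric() would instead impose an arbitrary ordering on its levels.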

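Many of the categorical variables have a large number of levels, and I know very little about these models. Is there a function I can use that lets me keep the categorical variables?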
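One possibility that needs only base R is to fit the continuation ratio model as an ordinary logistic regression on expanded data, since glm() accepts factors directly in the formula. The sketch below uses simulated data with hypothetical column names and models the "stopping" form, logit P(Y = j | Y >= j), with a separate intercept per cutpoint and shared slopes; the sign convention may differ from the packages mentioned above.

```r
# Simulated stand-in data; column names are hypothetical.
set.seed(123)
n <- 400
toy <- data.frame(
  admission = factor(sample(c("Elective", "Emergency"), n, replace = TRUE)),
  age       = round(runif(n, 20, 90))
)
toy$los_cat <- factor(
  sample(c("short", "medium", "long"), n, replace = TRUE),
  levels = c("short", "medium", "long"),
  ordered = TRUE
)

y <- as.integer(toy$los_cat)   # category index 1..K
K <- nlevels(toy$los_cat)

# For each cutpoint j = 1..K-1, keep the subjects that reached category j
# (y >= j) and record whether they stopped there (y == j).
expanded <- do.call(rbind, lapply(seq_len(K - 1), function(j) {
  at_risk <- toy[y >= j, , drop = FALSE]
  at_risk$cutpoint <- j
  at_risk$stopped  <- as.integer(y[y >= j] == j)
  at_risk
}))
expanded$cutpoint <- factor(expanded$cutpoint)

# Logistic regression on the expanded rows; factors stay as factors.
fit <- glm(stopped ~ cutpoint + admission + age,
           family = binomial, data = expanded)
print(coef(fit))
```

This does not remove the practical problem of factors with hundreds of levels: each level still costs one coefficient, so merging rare levels or using a penalized fit would probably still be needed.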

Here are my variable types:

$ Admission Type                     : Factor w/ 3 levels "Elective - Booked",..: 3 3 3 3 3 3 1 3 3 3 ...
 $ Treatment Function                 : Factor w/ 7 levels "BREAST SURGERY",..: 3 3 1 3 1 2 2 1 3 3 ...
 $ Main Health Care Provider Job Title: Factor w/ 25 levels "Colorectal Consultant",..: 4 4 6 4 7 8 4 7 4 4 ...
 $ Dominant Procedure OPCS Code       : Factor w/ 373 levels "A732","A752",..: 219 55 24 342 344 132 122 21 219 55 ...
 $ Procedure Code Desc                : Factor w/ 373 levels "Abdominolipectomy",..: 325 11 261 224 20 60 319 368 325 11 ...
 $ Primary Diagnosis ICD10 Code       : Factor w/ 330 levels "A099","A419",..: 220 169 32 162 30 183 16 32 220 139 ...
 $ Primary Diagnosis Desc             : Factor w/ 323 levels "Abdominal aortic aneurysm, without mention of rupture",..: 67 89 197 311 196 98 191 197 67 129 ...
 $ Secondary Diagnosis ICD10 Code     : Factor w/ 568 levels "A047","A099",..: 175 182 107 389 558 363 442 41 332 257 ...
 $ Secondary Diagnosis Desc           : Factor w/ 550 levels "Abdominal aortic aneurysm, without mention of rupture",..: 177 110 521 141 415 469 220 474 367 529 ...
 $ Theatre Description                : Factor w/ 30 levels "FGH Theatre 1",..: 12 12 16 12 16 21 21 16 12 12 ...
 $ ASA Code                           : Factor w/ 5 levels "1","2","3","4",..: 3 3 3 2 2 2 1 3 2 2 ...
 $ Anaesthetic Type                   : Factor w/ 11 levels "Block","Epidural Anaesthetic",..: 4 4 4 4 4 11 4 4 4 4 ...
 $ Was Operation Delayed?             : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
 $ Reason For Operation Delay         : Factor w/ 56 levels "Additions / Changes to List",..: 30 30 31 55 13 35 5 6 30 17 ...
 $ Is Readmission?                    : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ Did Patient Die?                   : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ Patient Age On Admit               : num  58 77 82 56 49 49 79 55 69 47 ...
 $ Patient Sex                        : Factor w/ 2 levels "Female","Male": 1 1 1 2 1 1 2 1 2 2 ...
 $ Patient Ethnicity                  : Factor w/ 15 levels "Any Other Ethnic Group",..: 14 14 14 11 14 11 14 14 14 14 ...
 $ Patient Deprivation Index Decile   : num  8 1 4 4 7 7 7 5 6 2 ...
 $ BMI                                : num  32.4 25.4 29.1 39.5 26.6 ...
 $ BMI Category                       : Factor w/ 4 levels "Healthy","Obese",..: 2 3 3 2 3 1 2 2 3 3 ...
 $ Smoking Status                     : Factor w/ 3 levels "Current Smoker",..: 2 2 2 1 3 2 2 1 2 3 ...
 $ Drinking Status                    : Factor w/ 2 levels "No","Yes": 2 1 2 2 2 1 2 2 2 2 ...
 $ OperationDuration                  : num  111 337 84 161 111 166 173 176 194 418 ...
 $ LOS                                : num  1 3 1 1 1 3 22 3 1 1 ...

0 answers:

No answers