Linear regression to model the number of diagnoses

Time: 2021-04-21 22:00:58

Tags: r

I want to use linear regression to model patients' number of diagnoses as a function of age and diagnosis1.

When I try to run the code lm(age~diagnosis1, data = df),

I get this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y'

I have already imputed the NAs using mode imputation, so there should not be any NAs left.

Here is the dput of the data:

structure(list(patient_nbr = c(3069L, 4302L, 5220L, 5220L, 5220L, 
5220L), encounter_id = c(36469686L, 85907514L, 7981038L, 33503946L, 
55397514L, 60892254L), admission_source_id = structure(c(1L, 
1L, 15L, 1L, 1L, 15L), .Label = c("1", "10", "11", "13", "14", 
"17", "2", "20", "22", "25", "3", "4", "5", "6", "7", "8", "9"
), class = "factor"), admission_type_id = structure(c(2L, 3L, 
1L, 2L, 2L, 1L), .Label = c("1", "2", "3", "4", "5", "6", "7", 
"8"), class = "factor"), discharge_disposition_id = structure(c(1L, 
1L, 1L, 1L, 23L, 1L), .Label = c("1", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "2", "20", "22", "23", "24", 
"25", "27", "28", "3", "4", "5", "6", "7", "8", "9"), class = "factor"), 
    medical_specialty = c("Cardiology", "Orthopedics-Reconstructive", 
    "InternalMedicine", "Cardiology", "InternalMedicine", "InternalMedicine"
    ), payer_code = c("?", "?", "?", "?", "?", "?"), readmitted = c("NO", 
    "NO", ">30", ">30", ">30", "NO"), time_in_hospital = c(8L, 
    1L, 2L, 11L, 8L, 1L), number_diagnoses = c(9L, 7L, 9L, 9L, 
    9L, 9L), num_medications = c(31L, 9L, 14L, 19L, 15L, 12L), num_procedures = c(6L, 
    1L, 0L, 4L, 3L, 0L), number_emergency = c(0L, 0L, 0L, 0L, 
    0L, 0L), number_inpatient = c(0L, 0L, 0L, 2L, 1L, 2L), number_outpatient = c(0L, 
    0L, 0L, 0L, 0L, 0L), age = c("[60-70)", "[70-80)", "[60-70)", 
    "[70-80)", "[70-80)", "[70-80)"), gender = c("Male", "Female", 
    "Male", "Male", "Male", "Male"), race = c("Caucasian", "Caucasian", 
    "Caucasian", "Caucasian", "Caucasian", "Caucasian"), weight = c("?", 
    "?", "?", "?", "?", "?"), diagnosis1 = c("circulatory", "musculoskeletal", 
    "EMI", "circulatory", "skin", "EMI"), diagnosis2 = c("genitourinary", 
    "EMI", "circulatory", "circulatory", "EMI", "skin"), diagnosis3 = c("blood", 
    "circulatory", "digestive", "EMI", "circulatory", "circulatory"
    )), row.names = 18:23, class = "data.frame")

1 answer:

Answer 0: (score: 1)

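age is a character column, and lm needs a numeric response, which is why it reports NA/NaN/Inf in 'y' even though the data has no NAs. You can convert age to a factor and then take its integer codes. Note that this treats the age bins as equally spaced numbers.

lm(as.numeric(factor(age)) ~ diagnosis1, data = data)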

Call:
lm(formula = as.numeric(factor(age)) ~ diagnosis1, data = data)

Coefficients:
              (Intercept)              diagnosis1EMI  diagnosis1musculoskeletal             diagnosis1skin  
                1.500e+00                 -4.567e-16                  5.000e-01                  5.000e-01
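
The question states that the goal is to model the number of diagnoses, whereas the call above uses age as the response. A minimal sketch of that alternative, assuming number_diagnoses is the intended response and age and diagnosis1 are categorical predictors, might look like the following. The data frame is rebuilt here from three columns of the six rows in the dput so that the snippet runs on its own; the name fit is only illustrative, and six rows are far too few for meaningful inference.

```r
# Three columns copied from the six rows in the question's dput.
df <- data.frame(
  number_diagnoses = c(9L, 7L, 9L, 9L, 9L, 9L),
  age = c("[60-70)", "[70-80)", "[60-70)", "[70-80)", "[70-80)", "[70-80)"),
  diagnosis1 = c("circulatory", "musculoskeletal", "EMI",
                 "circulatory", "skin", "EMI"),
  stringsAsFactors = FALSE
)

# The column has no missing values; the problem is its type.
anyNA(df$age)         # no NAs in age
is.character(df$age)  # age is stored as character

# Keep the numeric column as the response and store the
# categorical predictors as factors so lm() dummy-codes them.
df$age <- factor(df$age)
df$diagnosis1 <- factor(df$diagnosis1)

fit <- lm(number_diagnoses ~ age + diagnosis1, data = df)
coef(fit)
```

With age on the right-hand side as a factor, each age bin gets its own coefficient, so no equal spacing between bins is assumed.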