I want to use linear regression to model a patient's number of diagnoses as a function of age and diagnosis1.
When I try to run lm(age ~ diagnosis1, data = df),
I get this error:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y'
I have already filled in the NAs using mode imputation, so there should not be any NAs left.
Here is the dput of the data:
structure(list(patient_nbr = c(3069L, 4302L, 5220L, 5220L, 5220L,
5220L), encounter_id = c(36469686L, 85907514L, 7981038L, 33503946L,
55397514L, 60892254L), admission_source_id = structure(c(1L,
1L, 15L, 1L, 1L, 15L), .Label = c("1", "10", "11", "13", "14",
"17", "2", "20", "22", "25", "3", "4", "5", "6", "7", "8", "9"
), class = "factor"), admission_type_id = structure(c(2L, 3L,
1L, 2L, 2L, 1L), .Label = c("1", "2", "3", "4", "5", "6", "7",
"8"), class = "factor"), discharge_disposition_id = structure(c(1L,
1L, 1L, 1L, 23L, 1L), .Label = c("1", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "2", "20", "22", "23", "24",
"25", "27", "28", "3", "4", "5", "6", "7", "8", "9"), class = "factor"),
medical_specialty = c("Cardiology", "Orthopedics-Reconstructive",
"InternalMedicine", "Cardiology", "InternalMedicine", "InternalMedicine"
), payer_code = c("?", "?", "?", "?", "?", "?"), readmitted = c("NO",
"NO", ">30", ">30", ">30", "NO"), time_in_hospital = c(8L,
1L, 2L, 11L, 8L, 1L), number_diagnoses = c(9L, 7L, 9L, 9L,
9L, 9L), num_medications = c(31L, 9L, 14L, 19L, 15L, 12L), num_procedures = c(6L,
1L, 0L, 4L, 3L, 0L), number_emergency = c(0L, 0L, 0L, 0L,
0L, 0L), number_inpatient = c(0L, 0L, 0L, 2L, 1L, 2L), number_outpatient = c(0L,
0L, 0L, 0L, 0L, 0L), age = c("[60-70)", "[70-80)", "[60-70)",
"[70-80)", "[70-80)", "[70-80)"), gender = c("Male", "Female",
"Male", "Male", "Male", "Male"), race = c("Caucasian", "Caucasian",
"Caucasian", "Caucasian", "Caucasian", "Caucasian"), weight = c("?",
"?", "?", "?", "?", "?"), diagnosis1 = c("circulatory", "musculoskeletal",
"EMI", "circulatory", "skin", "EMI"), diagnosis2 = c("genitourinary",
"EMI", "circulatory", "circulatory", "EMI", "skin"), diagnosis3 = c("blood",
"circulatory", "digestive", "EMI", "circulatory", "circulatory"
)), row.names = 18:23, class = "data.frame")
Answer 0 (score: 1)
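age is a character column. lm() needs a numeric response, so the character values are coerced to NA, which is what triggers the error even though the data contain no missing values. You can convert age to a factor and use its integer codes as the response:
lm(as.numeric(factor(age)) ~ diagnosis1, data = data)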
Call:
lm(formula = as.numeric(factor(age)) ~ diagnosis1, data = data)

Coefficients:
              (Intercept)              diagnosis1EMI  diagnosis1musculoskeletal             diagnosis1skin
                1.500e+00                 -4.567e-16                  5.000e-01                  5.000e-01
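Note that the question describes the number of diagnoses as the quantity to model, while the call above uses age as the response. A minimal sketch of the described model follows, assuming number_diagnoses is the intended response and that age and diagnosis1 are both meant as categorical predictors. The data frame is rebuilt from three columns of the dput in the question; the name age_as_number is only illustrative.

```r
# Six rows copied from the question's dput (only the columns used here)
df <- data.frame(
  number_diagnoses = c(9L, 7L, 9L, 9L, 9L, 9L),
  age = c("[60-70)", "[70-80)", "[60-70)", "[70-80)", "[70-80)", "[70-80)"),
  diagnosis1 = c("circulatory", "musculoskeletal", "EMI",
                 "circulatory", "skin", "EMI"),
  stringsAsFactors = FALSE
)

# Why the original call fails: age is character, so numeric coercion gives only NA
age_as_number <- suppressWarnings(as.numeric(df$age))
all(is.na(age_as_number))

# Assumption: number_diagnoses is the response; both predictors become factors
df$age <- factor(df$age)
df$diagnosis1 <- factor(df$diagnosis1)

fit <- lm(number_diagnoses ~ age + diagnosis1, data = df)
coef(fit)
```

With only six rows this fit leaves a single residual degree of freedom, so it is useful for checking that the call runs, not for interpreting the coefficients. On the full data, str(df) is a quick way to confirm which columns are still character before fitting.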