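I have a data frame (200x300) made up of mixed (character and numeric) variables, with many missing values (NA).
My first question is how to convert all of the data to numeric. I could use factors, but there are 100 columns to convert.
Second, my columns are not all expressed in equivalent units.
I would just like some good advice on preparing the data before I start the analysis.
Here is the structure of the data: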
structure(list(Hormonal.cycle.status..P4. = c(1, 1, 4, 1, 4,
1), Hormonal.medication.status = c(2, 1, 1, 2, 1, 1), Hormonal.medication.type = c(21,
27, 27, 26, 27, 27), ID.pathologist.main = c(3, 3, 3, 4, 2, 1
), ID.pathologist.sub = c(2, 1, 2, 2, 2, 2), Day.of.the.cycle_calculated = c(10,
8, 22, 19, 19, 12), Cycle.status..histology.and.cycle.day. = c(12,
18, 9, 1, 18, 3), Cycle.status.final..P4..histology..cycle.day. = c(1,
4, 5, 1, 6, 3), Deep.lesion = c(2, 2, 1, 2, 1, 2), Ovarian.lesion = c(2,
2, 2, 1, 2, 2), Peritoneal.lesion = c(2, 2, 2, 2, 2, 2), Combination.of.lesions = c(1,
1, 7, 4, 7, 1), DEEP.all.types = c(2, 2, NA, 2, NA, 2), DEEP.uterosacral = c(2,
2, NA, 2, NA, 2), DEEP.RVE = c(1, 1, 1, 2, 1, 1), DEEP.bowel = c(1,
1, 1, 1, 1, 1), DEEP.bladder = c(1, 1, 1, 1, 1, 1), Ovarian.endo.cyst = c(5,
5, 3, 1, 3, 5), Peritoneal = c(2, NA, 2, NA, 2, 2), Peritoneal.size.total = c(3,
3, 3, 3, 3, 2), Date.of.the.surgery = c(96, 98, 17, 105, 107,
108), Type.of.surgery = c(1, 1, 1, 1, 1, 1), Perit..surface = c(1,
2, 3, 3, 2, 2), Perit..deep = c(3, NA, NA, NA, 3, 3), R.ovary.surface = c(NA,
1, 2, 1, 1, NA), R.ovary.deep = c(NA, NA, 2, NA, 3, NA), L.ovary.surface = c(2,
NA, 2, NA, 1, NA), L.ovary.deep = c(2, 4, 4, NA, 4, 4), F.d.block = c(NA,
NA, 1, 1, NA, 1), R.ovary.frail = c(NA, NA, 3, NA, NA, 3), R.ovary.tight = c(NA,
NA, NA, NA, 3, NA), L.ovary.frail = c(NA, NA, NA, 2, NA, NA),
L.ovary.tight = c(2, 2, 3, NA, 2, 2), R.tuba.frail = c(NA,
NA, NA, NA, NA, 2), R.tuba.tight = c(NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), L.tuba.frail = c(NA,
NA, NA, NA, NA, 2), L.tuba.tight = c(4, NA, NA, NA, 4, NA
), Elsewhere = c(35, NA, NA, 24, NA, NA), Other.diseases = c(16,
NA, NA, NA, NA, 11), In.microarray = c(2, 2, 1, 2, 2, 1),
In.cytokine.plexes = c(2, 2, 2, 2, 2, 2)), .Names = c("Hormonal.cycle.status..P4.",
"Hormonal.medication.status", "Hormonal.medication.type", "ID.pathologist.main",
"ID.pathologist.sub", "Day.of.the.cycle_calculated", "Cycle.status..histology.and.cycle.day.",
"Cycle.status.final..P4..histology..cycle.day.", "Deep.lesion",
"Ovarian.lesion", "Peritoneal.lesion", "Combination.of.lesions",
"DEEP.all.types", "DEEP.uterosacral", "DEEP.RVE", "DEEP.bowel",
"DEEP.bladder", "Ovarian.endo.cyst", "Peritoneal", "Peritoneal.size.total",
"Date.of.the.surgery", "Type.of.surgery", "Perit..surface", "Perit..deep",
"R.ovary.surface", "R.ovary.deep", "L.ovary.surface", "L.ovary.deep",
"F.d.block", "R.ovary.frail", "R.ovary.tight", "L.ovary.frail",
"L.ovary.tight", "R.tuba.frail", "R.tuba.tight", "L.tuba.frail",
"L.tuba.tight", "Elsewhere", "Other.diseases", "In.microarray",
"In.cytokine.plexes"), row.names = c("H003", "H004", "H006",
"H007", "H008", "H011"), class = "data.frame")