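I'm trying to collapse factor levels. At first, the count(a7_edu2) output shows that the collapse has worked, but when I check the structure and look at the data in the RStudio viewer, the change has not affected the actual variable.
Any advice on saving the result as a new variable or overwriting the old one? Thanks!
I use fct_collapse to collapse the levels into three categories, and mutate() to create the variable with the new levels. I've tried saving to a new variable, and I've also tried transmute() instead of mutate(). I'd be happy with either a new variable or replacing the old one.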
SCI_dem %>%
  mutate(a7_edu2 = fct_collapse(a7_edu2,
    Highschool = c("Elm School", "Grade 7 or 8", "Grade 9 to 11", "High School Diploma", "G.E.D"),
    Diploma = c("Diploma or Certificate from trade tech school", "Diploma or Certificate from community college or CEGEP"),
    Bachelors = c("Bachelor degree", "Degree (Medicine, Dentistry etc)", "Masters degree", "Doctorate")
  )) %>%
  count(a7_edu2) # this is the result I want, but when I check the structure, it hasn't been saved!
str(SCI_dem$a7_edu2)
I expected the output to be "Factor w/ 4 levels "Highschool","Diploma","Bachelors","Other"", but instead I get the original "Factor w/ 13 levels "Elm School","Grade 7 or 8",..: 8 7 6 10 7 7 8 3 7 10 ...".
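The symptom above is consistent with how dplyr pipelines behave: mutate() returns a modified copy, and nothing changes in SCI_dem unless the result is assigned back. The following is a minimal sketch of that difference, assuming a toy stand-in for SCI_dem with only three of the original levels (the data is made up for illustration).

```r
library(dplyr)
library(forcats)

# Toy stand-in for SCI_dem (hypothetical data, only three of the original levels)
SCI_dem <- tibble(
  a7_edu2 = factor(c("Elm School", "Grade 7 or 8", "Bachelor degree", "Elm School"))
)

# Without assignment: the collapsed result is printed, then discarded
SCI_dem %>%
  mutate(a7_edu2 = fct_collapse(a7_edu2,
    Highschool = c("Elm School", "Grade 7 or 8"),
    Bachelors  = "Bachelor degree"
  )) %>%
  count(a7_edu2)

nlevels(SCI_dem$a7_edu2)  # 3: the original levels are unchanged

# With assignment: SCI_dem is replaced by the modified copy
SCI_dem <- SCI_dem %>%
  mutate(a7_edu2 = fct_collapse(a7_edu2,
    Highschool = c("Elm School", "Grade 7 or 8"),
    Bachelors  = "Bachelor degree"
  ))

levels(SCI_dem$a7_edu2)  # now only the collapsed levels
```

Ending the pipeline in count() and assigning that would store the count table rather than the data, so in this sketch the assignment is done on the mutate() step alone.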
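Updated question: this works for saving one variable to a new df (SCI_collapse). However, when I try to save other new collapsed variables to the same data frame, it overwrites the previous collapse... I tried specifying a new column, SCI_collapse$edu, but that then renamed the new collapsed variable df... How can I collapse multiple variables and add each one to the new df?
Any advice on saving or on writing the pipe?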
SCI_collapse <- SCI_dem %>%
  mutate(a7_edu2 = fct_collapse(a7_edu2,
    Highschool = c("Elm School",
                   "Grade 7 or 8",
                   "Grade 9 to 11",
                   "High School Diploma",
                   "G.E.D"),
    Diploma = c("Diploma or Certificate from trade tech school",
                "Diploma or Certificate from community college or CEGEP"),
    Bachelors = c("Bachelor degree",
                  "Degree (Medicine, Dentistry etc)",
                  "Masters degree", "Doctorate")))
Answer 0 (score: 0)
This is what I ended up doing:
# Collapse levels (education)
SCI_dem <- SCI_dem %>%
  mutate(a7_edu2_col = fct_collapse(a7_edu2, # Save as new variable ending in _col
    Highschool = c("Elm School", "Grade 7 or 8", "Grade 9 to 11", "High School Diploma", "G.E.D"),
    Diploma = c("Diploma or Certificate from trade tech school", "Diploma or Certificate from community college or CEGEP"),
    Bachelors = c("Bachelor degree", "Degree (Medicine, Dentistry etc)", "Masters degree", "Doctorate"),
    Other = c("Other", "Prefer not to answer")
  ), a7_edu2_col = droplevels(a7_edu2_col)) %>% # drop empty levels of _col
  rename(a7_edu2_unc = a7_edu2)
I now have the new variable ending in _col, and the old variable has been renamed to end in _unc (uncollapsed). I then clean up by removing the columns ending in _unc.
SCI_dem <- select(SCI_dem, -ends_with("_unc"))
Which leaves me with a tidy, collapsed data frame :)
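If the uncollapsed column is not needed at any point, a shorter variant may be to overwrite it in place, which makes the rename() and select() steps unnecessary. This is only a sketch on made-up toy data with four of the original levels, not a replacement for the approach above.

```r
library(dplyr)
library(forcats)

# Toy stand-in for SCI_dem (hypothetical data, four of the original levels)
SCI_dem <- tibble(
  a7_edu2 = factor(c("Elm School", "Doctorate", "Other", "Prefer not to answer"))
)

# Overwrite a7_edu2 directly and drop any unused levels in the same step
SCI_dem <- SCI_dem %>%
  mutate(a7_edu2 = droplevels(fct_collapse(a7_edu2,
    Highschool = "Elm School",
    Bachelors  = "Doctorate",
    Other      = c("Other", "Prefer not to answer")
  )))

str(SCI_dem$a7_edu2)
```

The trade-off is that the original levels are gone after this step, so keeping a _col copy as in the answer is the safer choice when the raw categories might still be needed.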