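I have a set of columns that I want to collapse into a vector. Each column element is either a name or the string "0". For each row, I want to collect the elements that are names into a character vector stored in df$keywords. A sample data frame is pasted below. I would like
df$keywords[1,]
to be an empty vector, and
df$keywords[2,]
to be (ACT Science, study skills, MCAT).
Any help would be appreciated.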
structure(list(V31 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "0", class = "factor"), V32 = structure(c(1L,
2L, 4L, 5L, 7L, 8L, 6L, 5L, 3L, 3L), .Label = c("0", "ACT Science",
"English", "Microsoft PowerPoint", "physics", "proofreading",
"reading", "writing"), class = "factor"), V33 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "0", class = "factor"),
V34 = structure(c(1L, 7L, 5L, 5L, 8L, 2L, 6L, 5L, 3L, 4L), .Label = c("0",
"geography", "Italian", "literature", "prealgebra", "SAT reading",
"study skills", "trigonometry"), class = "factor"), V35 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "0", class = "factor"),
V36 = structure(c(1L, 3L, 4L, 4L, 7L, 2L, 6L, 4L, 5L, 5L), .Label = c("0",
"English", "MCAT", "precalculus", "proofreading", "SAT writing",
"writing"), class = "factor"), V37 = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "0", class = "factor"),
V38 = structure(c(1L, 1L, 5L, 5L, 2L, 1L, 4L, 5L, 3L, 6L), .Label = c("0",
"English", "GED", "physical science", "reading", "spelling"
), class = "factor")), .Names = c("V31", "V32", "V33", "V34",
"V35", "V36", "V37", "V38"), class = "data.frame", row.names = c(NA,
-10L))
Answer 0 (score: 3)
Assuming your data is assigned to x, the following does what I think you are after:
apply(x, 1, function(r) {tmp <- unique(r); tmp[tmp != 0]})
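apply is run over each row of the data frame; the function takes the unique elements of the row and drops the "0" entries. The result is a list of vectors of differing lengths, one per row, each holding that row's unique non-zero elements.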
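The row-wise apply call can be tried on a much smaller data frame. The sketch below is only an illustration: the three-row data frame and its column names V1 to V3 are made up for the example and are not the asker's data.

```r
# Hypothetical three-row data frame; "0" marks an empty slot.
x <- data.frame(
  V1 = c("0", "ACT Science", "physics"),
  V2 = c("0", "study skills", "physics"),
  V3 = c("0", "MCAT", "0"),
  stringsAsFactors = TRUE
)

# apply() coerces the data frame to a character matrix and hands each row
# to the function as a character vector.
res <- apply(x, 1, function(r) {
  tmp <- unique(r)    # drop repeated entries within the row
  tmp[tmp != "0"]     # drop the "0" placeholder
})

# The rows yield 0, 3 and 1 elements, so apply() cannot simplify the
# result and returns a list with one character vector per row.
str(res)
```

If every row happened to yield the same number of elements, apply would simplify the result to a vector or matrix instead of a list, so code that relies on a list should check for that case.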
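Answer 1 (score: 1)
In my first post I did not correctly understand the desired output. A slightly different approach is to use the %in% operator on each row, like this: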
df$keywords <- apply(df,1, function(x) c( x[! x %in% "0"]))
df$keywords
# keywords
#1
#2 ACT Science, study skills, MCAT, ACT Science, study skills, MCAT
#3 Microsoft PowerPoint, prealgebra, precalculus, reading, Microsoft PowerPoint, prealgebra, precalculus, reading
#4 physics, prealgebra, precalculus, reading, physics, prealgebra, precalculus, reading
#5 reading, trigonometry, writing, English, reading, trigonometry, writing, English
#6 writing, geography, English, writing, geography, English
#7 proofreading, SAT reading, SAT writing, physical science, proofreading, SAT reading, SAT writing, physical science
#8 physics, prealgebra, precalculus, reading, physics, prealgebra, precalculus, reading
#9 English, Italian, proofreading, GED, English, Italian, proofreading, GED
#10 English, literature, proofreading, spelling, English, literature, proofreading, spelling
If you want the unique set of skills for each row, just add a call to unique, as follows:
df$keywords <- apply(df,1, function(x) c( unique(x[ ! x %in% "0" ] ) ) )
df["keywords"]
# keywords
#1
#2 ACT Science, study skills, MCAT
#3 Microsoft PowerPoint, prealgebra, precalculus, reading
#4 physics, prealgebra, precalculus, reading
#5 reading, trigonometry, writing, English
#6 writing, geography, English
#7 proofreading, SAT reading, SAT writing, physical science
#8 physics, prealgebra, precalculus, reading
#9 English, Italian, proofreading, GED
#10 English, literature, proofreading, spelling
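Two properties of apply(df, 1, ...) are worth keeping in mind with this approach: it passes every column of df to the function, including a keywords column if one has already been added, and it simplifies the result to a matrix when all rows yield the same number of elements. A minimal sketch of one way around both is shown below. It assumes the skill columns can be named up front; skill_cols and extract_keywords are hypothetical names introduced for this example, and the small data frame is made up.

```r
# Hypothetical data; "0" marks an empty slot.
df <- data.frame(
  V31 = c("0", "ACT Science", "physics"),
  V32 = c("0", "study skills", "physics"),
  stringsAsFactors = FALSE
)

# Assumption: the columns holding skills are known by name.
skill_cols <- c("V31", "V32")

# Hypothetical helper: one character vector of unique non-"0" entries
# per row. lapply() always returns a list, whatever the lengths are.
extract_keywords <- function(d, cols) {
  m <- as.matrix(d[cols])              # only the skill columns
  lapply(seq_len(nrow(m)), function(i) {
    r <- m[i, ]
    unique(r[!r %in% "0"])
  })
}

df$keywords <- extract_keywords(df, skill_cols)
# Re-running is safe: the keywords column is never fed back in.
df$keywords <- extract_keywords(df, skill_cols)
```

Here df$keywords is a list column, so df$keywords[[2]] retrieves the character vector for the second row.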