I have a set of patient data, df, that I am trying to de-identify in R.
structure(list(name = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("Andrew",
"Jim", "Kurt", "Lester", "Mickey", "Taylor"), class = "factor"),
heart_rate = c(78L, 82L, 67L, 105L, 85L, 94L), age = c(35L,
23L, 43L, 52L, 33L, 45L), partner = structure(c(5L, 2L, 6L,
1L, 3L, 4L), .Label = c("Andrew", "Jim ", "Kurt ", "Lester ",
"Mickey ", "Taylor "), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
I would like to replace the names in the name and partner columns with the values from the id column of an object called key:
structure(list(name = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("Andrew",
"Jim", "Kurt", "Lester", "Mickey", "Taylor"), class = "factor"),
id = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("A3",
"J9", "K5", "L4", "M4", "T7"), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
I can de-identify the name column using this code:
df[["name"]] <- key[ match(df[['name']], key[['name']] ) , 'id']
However, when I try to de-identify the partner column using this code:
df[["partner"]] <- key[ match(df[['partner']], key[['name']] ) , 'id']
my data frame ends up looking like this, with NA for every partner except one:
structure(list(name = structure(c(2L, 5L, 1L, 6L, 4L, 3L), .Label = c("A3",
"J9", "K5", "L4", "M4", "T7"), class = "factor"), heart_rate = c(78L,
82L, 67L, 105L, 85L, 94L), age = c(35L, 23L, 43L, 52L, 33L, 45L
), partner = structure(c(NA, NA, NA, 1L, NA, NA), .Label = c("A3",
"J9", "K5", "L4", "M4", "T7"), class = "factor")), row.names = c(NA,
-6L), class = "data.frame")
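Does anyone have any suggestions? Bonus points for a method that could work on all columns of the dataset in one line. An explanation of the code would be much appreciated.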
Answer 0 (score: 2)
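The problem is that in the partner column of df, most of the names have a trailing space:
.Label = c("Andrew", "Jim ", "Kurt ", "Lester ", "Mickey ", "Taylor ")
This means that match() cannot find an exact match for any name except "Andrew", for which it correctly returns the index.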
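A quick way to confirm this diagnosis is to compare the raw strings directly. The sketch below uses small made-up vectors (key_names and partner_raw are illustrative names, not objects from the question) to show how a trailing space changes the result of match():

```r
# Illustrative vectors; these names are assumptions, not objects from the question
key_names   <- c("Andrew", "Jim", "Mickey")
partner_raw <- c("Mickey ", "Jim ", "Andrew")

# match() compares strings exactly, so a trailing space prevents a match
idx_raw <- match(partner_raw, key_names)
print(idx_raw)

# After stripping surrounding whitespace, every name is found in the key
idx_trimmed <- match(trimws(partner_raw), key_names)
print(idx_trimmed)

# nchar() exposes the hidden trailing character in the first two entries
print(nchar(partner_raw))
```

Only "Andrew" has no trailing space, which is why it is the only entry that matches before trimming.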
One way to fix this is to strip the whitespace from the partner column with trimws():
df$partner = trimws(df$partner)
After that, your original match() code works as expected.
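For the bonus part of the question (handling several columns at once), one possible approach is to apply the same trim-and-match lookup to each name column with lapply(). This is a minimal sketch, not the only way to do it. It rebuilds df and key from the dput() output in the question, and the cols vector is an assumption introduced here to list the columns that hold names:

```r
# Rebuild the example data from the question; stringsAsFactors = TRUE mimics its factor columns
df <- data.frame(
  name       = c("Jim", "Mickey", "Andrew", "Taylor", "Lester", "Kurt"),
  heart_rate = c(78L, 82L, 67L, 105L, 85L, 94L),
  age        = c(35L, 23L, 43L, 52L, 33L, 45L),
  partner    = c("Mickey ", "Jim ", "Taylor ", "Andrew", "Kurt ", "Lester "),
  stringsAsFactors = TRUE
)
key <- data.frame(
  name = c("Jim", "Mickey", "Andrew", "Taylor", "Lester", "Kurt"),
  id   = c("J9", "M4", "A3", "T7", "L4", "K5"),
  stringsAsFactors = TRUE
)

# Columns that contain names to replace (an assumption for this example)
cols <- c("name", "partner")

# For each column: trim whitespace, find each name's row in key, return that row's id
df[cols] <- lapply(df[cols], function(x) key$id[match(trimws(as.character(x)), key$name)])

print(df)
```

Here match() returns, for each value, its position in key$name, and indexing key$id with those positions returns the corresponding id. Any name that is missing from key would come back as NA, so it is worth checking the result with anyNA() before discarding the original data.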