I have a data frame given by:
DF <- structure(list(ID = c(1, 129, 169, 1087), `Collab Years Patents` = c(NA,
"2011, 2011, 2011", "2010", "2006, 2006"), `Collab Years Publications` = c("2011",
"2015, 2016, 2016", "2010", NA), ECP = c("2011", "2011", "2010",
"2006")), .Names = c("ID", "Collab Years Patents", "Collab Years Publications",
"ECP"), row.names = c(1L, 107L, 136L, 859L), class = "data.frame")
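The ECP column holds the minimum year across the two collaboration columns (each of which may contain several years). I need an output that says which column the ECP comes from. For example, a solution for the data above could be the same data frame with one additional column vector: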
structure(list(ID = c(1, 129, 169, 1087), `Collab Years Patents` = c(NA,
"2011, 2011, 2011", "2010", "2006, 2006"), `Collab Years Publications` = c("2011",
"2015, 2016, 2016", "2010", NA), ECP = c("2011", "2011", "2010",
"2006"), identifier = c("Publications", "Patents", "Both", "Patents"
)), .Names = c("ID", "Collab Years Patents", "Collab Years Publications",
"ECP", "identifier"), row.names = c(1L, 107L, 136L, 859L), class = "data.frame")
Answer 0 (score: 3)
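Here is an option using str_detect. Loop over the collaboration columns (sapply(DF[2:3], ...)) and use str_detect to check which of them contains the value of 'ECP'. Multiply by col to turn the TRUE values into column indices, replace the NA elements with 0, take the column name corresponding to the maximum column index in each row, strip the prefix from that column name with sub, and finally assign "Both" in the resulting vector 'v1' to the elements whose entries in 'm1' are both greater than 0, i.e. the rows where 'ECP' occurs in both columns.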
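The code that originally accompanied this answer is not preserved here, so the following is only a minimal sketch reconstructed from the description above, not the answerer's exact code. It assumes the stringr package is installed; the names m1 and v1 follow the text, while the exact regular expression passed to sub and the use of ties.method = "first" are assumptions.

```r
library(stringr)

# Same data as in the question (row names omitted for brevity)
DF <- data.frame(
  ID = c(1, 129, 169, 1087),
  `Collab Years Patents` = c(NA, "2011, 2011, 2011", "2010", "2006, 2006"),
  `Collab Years Publications` = c("2011", "2015, 2016, 2016", "2010", NA),
  ECP = c("2011", "2011", "2010", "2006"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Logical matrix: does each collaboration column contain the row's ECP year?
m1 <- sapply(DF[2:3], function(x) str_detect(x, DF$ECP))

# Turn TRUE into the column index (1 or 2) and NA into 0
m1 <- m1 * col(m1)
m1[is.na(m1)] <- 0

# Column name of the largest index per row, with the prefix stripped
v1 <- sub("^Collab Years ", "", colnames(m1)[max.col(m1, ties.method = "first")])

# Rows where both columns matched are labelled "Both"
v1[rowSums(m1 > 0) == 2] <- "Both"

DF$identifier <- v1
```

Note that str_detect treats the ECP value as a regular expression and matches substrings, which is harmless for four-digit years but would need care with other kinds of values.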
Answer 1 (score: 0)
Using tidyverse (dplyr and purrr):
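library(tidyverse)

DF %>%
  # split the comma-separated year strings into character vectors
  mutate_at(2:3, strsplit, ", ") %>%
  # .x = patent years, .y = publication years, ..3 = ECP;
  # index 1 = Publications only, 2 = Patents only, 3 = Both
  transmute(identifier = pmap(.[2:4], ~ c("Publications", "Patents", "Both")[
    2 * (..3 %in% .x) + (..3 %in% .y)])) %>%
  # append the new column to the original data frame
  bind_cols(DF, .)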
#     ID Collab Years Patents Collab Years Publications  ECP   identifier
# 1    1                 <NA>                      2011 2011 Publications
# 2  129     2011, 2011, 2011          2015, 2016, 2016 2011      Patents
# 3  169                 2010                      2010 2010         Both
# 4 1087           2006, 2006                      <NA> 2006      Patents
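The indexing trick in this answer may be easier to follow in isolation. The sketch below reproduces it in base R only; the helper in_col and the intermediate names in_patents, in_pubs and idx are hypothetical and introduced purely for illustration.

```r
# Same data as in the question (row names omitted for brevity)
DF <- data.frame(
  ID = c(1, 129, 169, 1087),
  `Collab Years Patents` = c(NA, "2011, 2011, 2011", "2010", "2006, 2006"),
  `Collab Years Publications` = c("2011", "2015, 2016, 2016", "2010", NA),
  ECP = c("2011", "2011", "2010", "2006"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Hypothetical helper: is each row's ECP among that row's split years?
in_col <- function(years, ecp) {
  mapply(function(y, e) e %in% y, strsplit(years, ", "), ecp)
}

in_patents <- in_col(DF$`Collab Years Patents`, DF$ECP)
in_pubs    <- in_col(DF$`Collab Years Publications`, DF$ECP)

# Encode the two flags as one index:
# 1 = Publications only, 2 = Patents only, 3 = both
idx <- 2 * in_patents + in_pubs

identifier <- c("Publications", "Patents", "Both")[idx]
```

One caveat of this encoding: if the ECP appeared in neither column, the index would be 0, and R silently drops zero indices, so the result would be shorter than the data frame. That cannot happen when ECP is really the minimum of the two columns, but it is worth checking on real data.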