Identify the column that holds the lowest value

Asked: 2018-06-06 21:10:34

Tags: r

I have a data frame given by the following:

DF <- structure(list(ID = c(1, 129, 169, 1087), `Collab Years Patents` = c(NA, 
"2011, 2011, 2011", "2010", "2006, 2006"), `Collab Years Publications` = c("2011", 
"2015, 2016, 2016", "2010", NA), ECP = c("2011", "2011", "2010", 
"2006")), .Names = c("ID", "Collab Years Patents", "Collab Years Publications", 
"ECP"), row.names = c(1L, 107L, 136L, 859L), class = "data.frame")

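The ECP column holds the minimum year across the two collaboration columns (each of which may contain several years). I need an output that states which column the ECP value comes from. For example, a solution for the data above could be one more column added to the previous data frame, with these elements: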

    structure(list(ID = c(1, 129, 169, 1087), `Collab Years Patents` = c(NA, 
"2011, 2011, 2011", "2010", "2006, 2006"), `Collab Years Publications` = c("2011", 
"2015, 2016, 2016", "2010", NA), ECP = c("2011", "2011", "2010", 
"2006"), identifier = c("Publications", "Patents", "Both", "Patents"
)), .Names = c("ID", "Collab Years Patents", "Collab Years Publications", 
"ECP", "identifier"), row.names = c(1L, 107L, 136L, 859L), class = "data.frame")

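For reference, the relationship between ECP and the two collaboration columns can be checked with a short base R sketch. It assumes the years are always separated by ", " as in the sample data; the helper name min_year is hypothetical and not part of the question.

```r
# Same sample data as in the question, rebuilt with data.frame() for brevity.
DF <- data.frame(
  ID = c(1, 129, 169, 1087),
  `Collab Years Patents` = c(NA, "2011, 2011, 2011", "2010", "2006, 2006"),
  `Collab Years Publications` = c("2011", "2015, 2016, 2016", "2010", NA),
  ECP = c("2011", "2011", "2010", "2006"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Hypothetical helper: smallest year found in two comma-separated strings,
# ignoring NA entries.
min_year <- function(patents, publications) {
  years <- unlist(strsplit(c(patents, publications), ", ", fixed = TRUE))
  as.character(min(as.numeric(years), na.rm = TRUE))
}

# Recompute the minimum year row by row and compare it with ECP.
ecp_check <- mapply(min_year, DF$`Collab Years Patents`,
                    DF$`Collab Years Publications`, USE.NAMES = FALSE)
identical(ecp_check, DF$ECP)
```
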
2 answers:

Answer 0 (score: 3)

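Here is an option using str_detect. Loop through the collaboration columns (sapply(DF[2:3],) and use str_detect to check which of the columns contain the value of 'ECP'. Multiply by col to convert the TRUE values into column indices, replace the NA elements with 0, get the corresponding column name based on the maximum column index, remove the prefix part of the column name with sub, and assign 'Both' to those elements of the created vector 'v1' for which both entries of 'm1' are greater than 0, i.e. where 'ECP' occurs in both columns.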
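
The steps described in this answer can be sketched as follows. This is a reconstruction from the prose, assuming the stringr package is installed, and is not necessarily the answerer's original code. The names m1 and v1 come from the description; the literal prefix passed to sub and the ties.method = "first" argument are assumptions.

```r
library(stringr)

# Same sample data as in the question, rebuilt with data.frame() for brevity.
DF <- data.frame(
  ID = c(1, 129, 169, 1087),
  `Collab Years Patents` = c(NA, "2011, 2011, 2011", "2010", "2006, 2006"),
  `Collab Years Publications` = c("2011", "2015, 2016, 2016", "2010", NA),
  ECP = c("2011", "2011", "2010", "2006"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# For each collaboration column, test whether the row's ECP year occurs in it
# (the result is NA where the column itself is NA).
m1 <- sapply(DF[2:3], function(x) str_detect(x, DF$ECP))

# Multiply by the column index so TRUE becomes 1 (Patents) or 2 (Publications),
# then treat missing values as "no match".
m1 <- m1 * col(m1)
m1[is.na(m1)] <- 0

# Take the column with the largest index per row and strip the name prefix
# (assumed prefix: "Collab Years ").
v1 <- sub("Collab Years ", "",
          names(DF)[2:3][max.col(m1, ties.method = "first")],
          fixed = TRUE)

# Rows where both columns matched are labelled "Both".
v1[rowSums(m1 > 0) == 2] <- "Both"

DF$identifier <- v1
DF$identifier
```

In this sketch, a row whose ECP value matched neither column would be labelled with the first column's name, so such rows would need separate handling; none occur in the sample data.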

Answer 1 (score: 0)

Using the tidyverse (dplyr, purrr):

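library(tidyverse)

DF %>%
  # split the two collaboration columns into vectors of years
  mutate_at(2:3,strsplit,", ") %>%
  # index 1 = ECP only in Publications, 2 = only in Patents, 3 = in both
  transmute(identifier = pmap(.[2:4],~c("Publications","Patents","Both")[
    2*(..3 %in% .x) + (..3 %in% .y)])) %>%
  # append the new column to the original data frame
  bind_cols(DF,.)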

#     ID Collab Years Patents Collab Years Publications  ECP   identifier
# 1    1                 <NA>                      2011 2011 Publications
# 2  129     2011, 2011, 2011          2015, 2016, 2016 2011      Patents
# 3  169                 2010                      2010 2010         Both
# 4 1087           2006, 2006                      <NA> 2006      Patents
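
The subsetting index in the pipeline, 2*(..3 %in% .x) + (..3 %in% .y), encodes the two membership tests as a single number. A small base R sketch of that lookup, using made-up logical flags rather than the question's data, may make the mapping easier to follow; the variable names are illustrative only.

```r
labels <- c("Publications", "Patents", "Both")

# Illustrative flags: does the ECP year occur in each column?
in_patents      <- c(FALSE, TRUE, TRUE)
in_publications <- c(TRUE, FALSE, TRUE)

# Patents contributes 2, Publications contributes 1:
# 1 = Publications only, 2 = Patents only, 3 = both.
idx <- 2 * in_patents + in_publications

labels[idx]
```

When neither test is TRUE the index is 0, and subsetting with 0 returns an empty vector, so such a row would get an empty entry in the list column produced by pmap.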