I have a data frame containing a vector of thousands of project codes, each representing a different type of research. Here is an example:
Data <- data.frame(Assignment = c("C-209", "B-543", "G-01", "LOG"))
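The first letter of the assignment code indicates the type of research: C = Cartography, B = Biology, G = Geology, LOG = Logistics.
I want to create a new column that looks at the first letter of the Assignment column and uses it to indicate the research type.
I have tried something similar to this thread, but I know I'm missing something: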
R - Creating New Column Based off of a Partial String
Data <- data.frame(Assignment = c("C-209", "B-543", "G-01", "LOG"))
Types <- data.frame(Type = c("Cartography", "Biology", "Geology", "Logistic"),
                    stringsAsFactors = FALSE)
Data %>%
  mutate(Type = str_match(Assignment, Types$Type)[1, ])
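Answer 0 (score: 1)
You can add a new Code column to the Types data.frame and then join it to the original table. You will also need to create a matching Code column in the Data data.frame, holding the first character of each Assignment value.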
library(dplyr)
library(stringr)
Data <- data.frame(Assignment = c("C-209", "B-543", "G-01", "LOG"))
Types <- data.frame(Type = c("Cartography", "Biology", "Geology", "Logistic"),
                    Code = c("C", "B", "G", "L"), # Create new column here
                    stringsAsFactors = FALSE)
Data <- Data %>% mutate(Code = substr(Assignment,1L,1L)) # extract first character
Data <- left_join(Data, Types, by = "Code") %>% select(Assignment, Type) # combine
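If you would rather not add a join, the same first-letter lookup can probably be done in base R with a named character vector. This is only a sketch: the name type_lookup is made up for illustration, and it assumes every code starts with one of the four letters from the question (anything else would come back as NA).

```r
# Example data from the question; keep Assignment as character
Data <- data.frame(Assignment = c("C-209", "B-543", "G-01", "LOG"),
                   stringsAsFactors = FALSE)

# Hypothetical lookup: names are the first letters, values are the research types
type_lookup <- c(C = "Cartography", B = "Biology", G = "Geology", L = "Logistic")

# Extract the first character of each code and index the lookup by name;
# unname() drops the letter names so Type is a plain character column
Data$Type <- unname(type_lookup[substr(Data$Assignment, 1, 1)])
```

Indexing a named vector by a character vector returns one element per code, so no merge step is needed. The join approach above may still be preferable if the types already live in their own table.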