I have the following data frame:
library(tidyverse)
dat <- structure(list(seq_name = c("Peptide_set1.r1", "Peptide_set2.r1"
), peptide = c("KSKLRHGC", "AAYVYVNQF"
)), .Names = c("seq_name", "peptide"), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"))
dat
#> # A tibble: 2 x 2
#> seq_name peptide
#> <chr> <chr>
#> 1 Peptide_set1.r1 KSKLRHGC
#> 2 Peptide_set2.r1 AAYVYVNQF
What I want to do is convert it into this list of vectors:
$Peptide_set1.r1
[1] "K" "S" "K" "L" "R" "H" "G" "C"
$Peptide_set2.r1
[1] "A" "A" "Y" "V" "Y" "V" "N" "Q" "F"
How can I do this?
Answer 0 (score: 3)
We can use strsplit to split each string into its individual characters, and setNames to assign the names:
setNames(strsplit(dat$peptide, ""), dat$seq_name)
#$Peptide_set1.r1
#[1] "K" "S" "K" "L" "R" "H" "G" "C"
#$Peptide_set2.r1
#[1] "A" "A" "Y" "V" "Y" "V" "N" "Q" "F"
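To make the two steps easier to follow, here is a minimal sketch that runs them separately. It uses two plain character vectors as a stand-in for the columns of `dat`; the names `chars` and `res` are only illustrative.

```r
# Stand-in for the two columns of `dat` (values taken from the question)
peptide  <- c("KSKLRHGC", "AAYVYVNQF")
seq_name <- c("Peptide_set1.r1", "Peptide_set2.r1")

# Step 1: splitting on the empty string yields an unnamed list,
# with one character vector per input string
chars <- strsplit(peptide, "")
names(chars)     # NULL
lengths(chars)   # 8 9

# Step 2: setNames() returns the same list with names attached
res <- setNames(chars, seq_name)
res[["Peptide_set1.r1"]]
```

Once the list is named, individual peptides can be looked up by sequence name, as in the last line.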
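To use column indices instead of names, we can use pull to extract the column values as a vector. This is needed because dat is a tibble, so single-bracket indexing such as dat[2] returns a one-column tibble rather than a vector: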
library(dplyr)
setNames(strsplit(pull(dat[2]), ""), pull(dat[1]))
#$Peptide_set1.r1
#[1] "K" "S" "K" "L" "R" "H" "G" "C"
#$Peptide_set2.r1
#[1] "A" "A" "Y" "V" "Y" "V" "N" "Q" "F"
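The reason pull is needed can be checked directly. The sketch below rebuilds the question's data with tibble::tibble (an assumption made for brevity; the original uses structure()) and compares the types involved. The names `col_a` and `col_b` are illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for brevity
dat <- tibble::tibble(
  seq_name = c("Peptide_set1.r1", "Peptide_set2.r1"),
  peptide  = c("KSKLRHGC", "AAYVYVNQF")
)

# Single-bracket indexing keeps the tibble wrapper
is.data.frame(dat[2])   # TRUE
is.character(dat[2])    # FALSE

# pull() extracts the underlying character vector;
# it also accepts a column position directly
col_a <- pull(dat[2])
col_b <- pull(dat, 2)
is.character(col_a)     # TRUE
identical(col_a, col_b) # TRUE
```

Passing the position to pull, as in `pull(dat, 2)`, avoids the intermediate one-column tibble.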
We can also do all of this inside a dplyr chain, storing the result as a list column:
library(tidyverse)
dat1 <- dat %>% mutate(new = setNames(strsplit(pull(dat[2]), ""), pull(dat[1])))
dat1$new
#$Peptide_set1.r1
#[1] "K" "S" "K" "L" "R" "H" "G" "C"
#$Peptide_set2.r1
#[1] "A" "A" "Y" "V" "Y" "V" "N" "Q" "F"
As @thelatemail commented, we can use [[ instead of pull:
setNames(strsplit(dat[[2]], ""), dat[[1]])
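As a quick sanity check, the three variants shown in this answer can be compared with identical(). This is a sketch that assumes the same data as in the question, rebuilt with tibble::tibble; the names `by_name`, `by_pull` and `by_index` are illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for brevity
dat <- tibble::tibble(
  seq_name = c("Peptide_set1.r1", "Peptide_set2.r1"),
  peptide  = c("KSKLRHGC", "AAYVYVNQF")
)

# The three ways of selecting the columns used above
by_name  <- setNames(strsplit(dat$peptide, ""), dat$seq_name)
by_pull  <- setNames(strsplit(pull(dat[2]), ""), pull(dat[1]))
by_index <- setNames(strsplit(dat[[2]], ""), dat[[1]])

# All three should produce the same named list
identical(by_name, by_pull)
identical(by_name, by_index)
```

The `[[` form needs no extra package, which may be preferable when dplyr is not otherwise in use.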