I have two named numeric vectors of length 15000, met and nor, which share some names. For example:
head(met)
ALB IGKJ1 IGKC IGKJ4 IGKJ2 IGHG2
25.75415 20.55957 18.28749 17.87589 17.22944 16.60235
head(nor)
SAA1 CRP RNVU SNORD68 CYP1A2 IGKJ1
25.74548 24.05058 16.72566 15.05746 13.75348 10.74111
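I want to keep an entry if its name in met also exists in nor and its met value is greater than 1.5 times the corresponding nor value.
In the example above, IGKJ1 would be the only output of that comparison.
How can I code this?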
Answer 0 (score: 1)
library(dplyr)
# get named vectors
met = c(25.75415, 20.55957, 18.28749, 17.87589, 17.22944, 16.60235)
names(met) = c("ALB", "IGKJ1", "IGKC", "IGKJ4", "IGKJ2", "IGHG2")
nor = c(25.74548, 24.05058, 16.72566, 15.05746, 13.75348, 10.74111)
names(nor) = c("SAA1", "CRP", "RNVU", "SNORD68", "CYP1A2", "IGKJ1")
# transform them as data frames
dt_met = data.frame(v_met = met)
dt_met$names = row.names(dt_met)
dt_nor = data.frame(v_nor = nor)
dt_nor$names = row.names(dt_nor)
First option: keep the names and both values as rows of a new data frame:
# keep names as a dataset
dt_met %>%
inner_join(dt_nor, by="names") %>% # keep names that exist in both datasets
filter(v_met > 1.5*v_nor) %>% # keep rows where the condition is satisfied
select(names, everything()) # order columns
# names v_met v_nor
# 1 IGKJ1 20.55957 10.74111
Second option: keep only the names that pass your condition, then use them to subset the original vector:
# save names as a vector
dt_met %>%
inner_join(dt_nor, by="names") %>%
filter(v_met > 1.5*v_nor) %>%
pull(names) -> new_names
# subset met using those names
met[names(met) %in% new_names]
# IGKJ1
# 20.55957
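If you would rather avoid the dplyr dependency, the same comparison can probably be done in base R by indexing both vectors with their shared names. This is only a minimal sketch, assuming the names are unique within each vector (name-based indexing returns the first match otherwise); the variable names common, keep, and result are made up for illustration.

```r
# example vectors from the question
met <- c(ALB = 25.75415, IGKJ1 = 20.55957, IGKC = 18.28749,
         IGKJ4 = 17.87589, IGKJ2 = 17.22944, IGHG2 = 16.60235)
nor <- c(SAA1 = 25.74548, CRP = 24.05058, RNVU = 16.72566,
         SNORD68 = 15.05746, CYP1A2 = 13.75348, IGKJ1 = 10.74111)

# names present in both vectors (assumes names are unique within each vector)
common <- intersect(names(met), names(nor))

# of those, keep the names whose met value exceeds 1.5 times the nor value
keep <- common[met[common] > 1.5 * nor[common]]

# subset met by the surviving names
result <- met[keep]
```

Because both vectors are indexed by the same character vector common, the values line up element by element, so no join is needed. On 15000 elements this should be fast, although I have not benchmarked it against the dplyr version.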