I have reduced the problem to a minimal dataset (df_sum) that contains only character values:
"LPC(20:1) uM" "LPE(16:0) uM" "LPE(16:1) uM" "LPE(18:0) uM" "LPE(18:1) uM" "PA(32:1) uM" "PA(34:1) uM"
"PA(36:1) uM" "PS(34:1) uM" "PS(36:1) uM" "PG(34:1) uM" "PG(36:1) uM" "PE(28:0) uM" "PE(30:1) uM"
"LPC(20:1)" "LPE(16:0)" "LPE(16:1)" "LPE(18:0)" "LPE(18:1)" "PA(32:1)" "PA(34:1)"
"PS(36:1)" "PG(34:1)"
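As you can see, some of the values are the same apart from an extra " uM" tag at the end.
My goal is to find which values are unique and which are effectively identical, without removing the uM tag from the data (what I have tried so far is, for example, df_sum <- sub(" uM", "", df_sum)).
Any help would be greatly appreciated.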
Answer 0: (score: 0)
OK, I got it working. Here is the code I used:
# Names from the joint dataset that carry the 'uM' tag
names.um <- names(dplyr::select(df_sum, dplyr::contains("uM")))
# Strip the trailing ' uM' tag from this copy only; df_sum itself is untouched
names.um <- sub(" uM$", "", names.um)
# 'Filou' names from the joint dataset (the untagged ones, which end in ")")
names.filou <- names(dplyr::select(df_sum, dplyr::ends_with(")")))
# (1) positions where 'Filou' names have no match among the 'uM' names
pos.filou <- which(!names.filou %in% names.um)
# (2) positions where 'uM' names have no match among the 'Filou' names
pos.um <- which(!names.um %in% names.filou)
names.filou[pos.filou]  # show the names from (1)
names.um[pos.um]        # show the names from (2)
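For reference, the same comparison can probably be done in base R alone if the labels are held in a plain character vector. The following is only a sketch under that assumption: the vector `labels` is a hypothetical, shortened stand-in for the values in the question, and the variable names are invented for illustration. The tag is stripped from a copy, so the original labels stay unchanged.

```r
# Hypothetical stand-in for the labels in the question (plain character vector)
labels <- c("LPC(20:1) uM", "LPE(16:0) uM", "PA(36:1) uM", "PS(36:1) uM",
            "LPC(20:1)", "LPE(16:0)", "PS(36:1)", "PG(34:1)")

tagged   <- grepl(" uM$", labels)            # TRUE where a label ends in " uM"
um_base  <- sub(" uM$", "", labels[tagged])  # tagged labels, tag stripped (copy only)
untagged <- labels[!tagged]                  # labels that never carried the tag

shared        <- intersect(um_base, untagged)  # present in both forms
only_um       <- setdiff(um_base, untagged)    # present only with the " uM" tag
only_untagged <- setdiff(untagged, um_base)    # present only without the tag

print(shared)
print(only_um)
print(only_untagged)
```

Anchoring the pattern with `$` means only a trailing " uM" is treated as the tag, and `intersect`/`setdiff` give the matches and the two sets of mismatches without needing position indices.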