Merging different gene-expression results on a paired-gene condition has turned into one of my personal nightmares. This is my merged data frame:
knowngene1 Logfold1 Gene1 knowngene2 Logfold2 Gene2
uc001ezv.3 5.1167021111 NA uc001ezu.1 5.6262305191 FLG
uc001ihe.4 4.1338871783 LOC100216001 uc001ihg.3 3.9475325801 NA
uc001iki.4 9.9902455211 CELF2 uc001ikn.2 9.3321964303 NA
uc001ikk.2 10.3059806111 CELF2 uc001ikn.2 9.3321964303 NA
uc001ikl.4 9.9890468379 CELF2 uc001ikn.2 9.3321964303 NA
uc001ikn.2 9.8293484977 NA uc001iki.4 9.4401488053 CELF2
uc001ikn.2 9.8293484977 NA uc001ikk.2 9.2887954663 CELF2
uc001ikn.2 9.8293484977 NA uc001ikl.4 9.4401488053 CELF2
uc001ikn.2 9.8293484977 NA uc010qbi.2 8.6399349792 CELF2
uc001ikn.2 9.8293484977 NA uc010qbj.1 9.2887954663 CELF2
uc001ezu.1 5.6262305191 FLG uc001ezv.3 5.1167021111 NA
uc001ihg.3 3.9475325801 NA uc001ihe.4 4.1338871783 LOC100216001
uc001iki.4 9.4401488053 CELF2 uc001ikn.2 9.8293484977 NA
uc001ikk.2 9.2887954663 CELF2 uc001ikn.2 9.8293484977 NA
uc001ikl.4 9.4401488053 CELF2 uc001ikn.2 9.8293484977 NA
uc001ikn.2 9.3321964303 NA uc001iki.4 9.9902455211 CELF2
uc001ikn.2 9.3321964303 NA uc001ikk.2 10.3059806111 CELF2
uc001ikn.2 9.3321964303 NA uc001ikl.4 9.9890468379 CELF2
uc001ikn.2 9.3321964303 NA uc010qbi.2 10.3865530025 CELF2
uc001ikn.2 9.3321964303 NA uc010qbj.1 10.3072927485 CELF2
uc001iot.1 6.9068905956 NA uc001iou.2 8.4040043896 VIM
uc001iou.2 10.4420548632 VIM uc001iot.1 5.8235197903 NA
uc001ipd.3 4.4693510978 ST8SIA6 uc001ipf.1 5.1931857169 NA
uc001kgd.3 3.5469561781 NA uc009xts.3 4.0607448636 IFIT2
uc001kgf.3 3.3975573789 IFIT3 uc001kgd.3 3.2512633588 NA
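The point is that I do not want to remove plain duplicated rows. I want to remove the rows in which the knowngene accessions have simply been swapped between knowngene1 and knowngene2. Let me give an example. This first row is the one I want to keep: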
uc001ikn.2 9.8293484977 NA uc001iki.4 9.4401488053 CELF2
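To me, the following rows are the same thing. The first of them is actually the mirror image of the row I want to keep, and its expression values are in more or less the same range: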
uc001iki.4 9.4401488053 CELF2 uc001ikn.2 9.8293484977 NA
uc001ikn.2 9.3321964303 NA uc001ikl.4 9.9890468379 CELF2
So my idea is to keep only the first row I see and drop the later ones. Do you have any ideas?
Answer 0 (score: 1)
Do you want to remove all rows in which uc001ikn.2 appears? If so, I think this will work:
Rgames> foo
     [,1] [,2]
[1,]    1    7
[2,]    2    8
[3,]    3    9
[4,]    2    3
[5,]    4    1
[6,]    3   10
[7,]    5   11
[8,]    6   12
Rgames> foo[!duplicated(foo[,1])&!(foo[,2]%in%duplicated(foo[,1])),]
     [,1] [,2]
[1,]    1    7
[2,]    2    8
[3,]    3    9
[4,]    5   11
[5,]    6   12
In your case, you would use the df$knowngene1 and df$knowngene2 columns.
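
One caveat about the expression above: duplicated(foo[,1]) returns a logical vector, so the %in% test compares the second column against TRUE/FALSE (coerced to 1/0 for numbers), not against the duplicated IDs themselves. With character accessions that test would likely never match. If the goal is what the question describes (keep the first row of each pair and drop later rows that list the same two accessions in either order), a minimal sketch could build an order-independent key and deduplicate on it. The toy data frame, the rounded values, and the names pair_key and dedup below are illustrative assumptions, not part of the original post.

```r
# Toy data: a few rows from the question, values rounded for brevity (assumption).
df <- data.frame(
  knowngene1 = c("uc001ikn.2", "uc001iki.4", "uc001ezv.3", "uc001ezu.1"),
  Logfold1   = c(9.83, 9.44, 5.12, 5.63),
  knowngene2 = c("uc001iki.4", "uc001ikn.2", "uc001ezu.1", "uc001ezv.3"),
  Logfold2   = c(9.44, 9.83, 5.63, 5.12),
  stringsAsFactors = FALSE
)

# Order-independent key: the alphabetically smaller accession always goes first,
# so the pair (A, B) and its mirror image (B, A) produce the same key.
pair_key <- paste(pmin(df$knowngene1, df$knowngene2),
                  pmax(df$knowngene1, df$knowngene2),
                  sep = "|")

# duplicated() flags every occurrence of a key after the first one,
# so negating it keeps only the first row seen for each pair.
dedup <- df[!duplicated(pair_key), ]
print(dedup)
```

This keys on the two accessions only, so two rows with the same pair but different Logfold values are still treated as duplicates and only the first is kept. If the expression values should also have to match, they would need to be added to the key.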