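My file, file.csv, has the format shown below. I need to compare the LeftChr and RightChr columns, get the unique combinations, and strip the chr prefix, so that the result lists each unique combination wrapped in t(...) after the file name, as in the result shown below.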
Id LeftChr LeftPosition LeftStrand LeftLength RightChr
4465 chr1 33478980 + 60 chr1
4751 chr1 37908641 + 370 chr2
1690 chr1 37938262 - 112 chr5
4464 chr1 37938376 + 122 chr2
4463 chr2 59097215 + 675 chr2
Result
file.csv: t(1:1), t(1:2), t(1:5),t(2:2)
Answer 0 (score: 1)
Assuming you have read this into a data frame named data:
# paste0() avoids the spaces that paste() would insert between its arguments
x = with(data, unique(gsub(pattern = "chr",
                           replacement = "",
                           x = paste0("t(", LeftChr, ":", RightChr, ")"))))
paste0("file.csv: ", paste(x, collapse = ", "))
Answer 1 (score: 1)
# Read the sample data; as.is = TRUE keeps the text columns as character
dat <- read.table(text="
Id LeftChr LeftPosition LeftStrand LeftLength RightChr
4465 chr1 33478980 + 60 chr1
4751 chr1 37908641 + 370 chr2
1690 chr1 37938262 - 112 chr5
4464 chr1 37938376 + 122 chr2
4463 chr2 59097215 + 675 chr2
", header = TRUE, as.is = TRUE)
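library(dplyr)  # provides %>%, mutate(), select(), group_by()

dat %>%
  # strip the "chr" prefix from both chromosome columns
  mutate(lc = gsub("chr", "", LeftChr), rc = gsub("chr", "", RightChr)) %>%
  select(lc, rc) %>%
  group_by(lc, rc) %>%
  unique()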
# Source: local data frame [4 x 2]
# Groups: lc, rc [4]
#
# lc rc
# (chr) (chr)
# 1 1 1
# 2 1 2
# 3 1 5
# 4 2 2
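
The pipeline above stops at the table of unique pairs and does not build the single line the question asks for. One way to finish the job in base R is sketched below. It assumes the sample data from the question; file_name, pairs and result are names introduced here purely for illustration, and the output uses ", " between every item.

```r
# Sample data from the question, repeated so the sketch runs on its own
dat <- read.table(text = "
Id LeftChr LeftPosition LeftStrand LeftLength RightChr
4465 chr1 33478980 + 60 chr1
4751 chr1 37908641 + 370 chr2
1690 chr1 37938262 - 112 chr5
4464 chr1 37938376 + 122 chr2
4463 chr2 59097215 + 675 chr2
", header = TRUE, as.is = TRUE)

file_name <- "file.csv"  # hypothetical variable, not part of the original answers

# Strip the leading "chr" and keep each Left/Right pair once,
# in order of first appearance
pairs <- unique(data.frame(lc = sub("^chr", "", dat$LeftChr),
                           rc = sub("^chr", "", dat$RightChr)))

# Format each pair as t(left:right) and join everything into one line
result <- paste0(file_name, ": ",
                 paste0("t(", pairs$lc, ":", pairs$rc, ")", collapse = ", "))
print(result)
# [1] "file.csv: t(1:1), t(1:2), t(1:5), t(2:2)"
```

Anchoring the pattern with "^chr" only removes the prefix at the start of the value, which is slightly stricter than the gsub("chr", "", ...) used in the answers.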