Get unique combinations from two columns in R

Time: 2015-12-07 01:54:04

Tags: r

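My file, file.csv, has the format shown below. I need to compare the LeftChr and RightChr columns, get the unique combinations, and strip the chr prefix, so that the result lists one t(left:right) entry per unique combination after the file name, as shown under Result below.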

Id LeftChr LeftPosition    LeftStrand  LeftLength  RightChr
4465    chr1    33478980    +   60  chr1
4751    chr1    37908641    +   370 chr2
1690    chr1    37938262    -   112 chr5
4464    chr1    37938376    +   122 chr2
4463    chr2    59097215    +   675 chr2

Result

file.csv:  t(1:1), t(1:2), t(1:5),t(2:2)

2 answers:

Answer 0: (score: 1)

Assuming you have read this into a data frame named data:

# paste0() joins the pieces without the padding spaces that paste() would insert
x <- with(data, unique(gsub(pattern = "chr",
                            replacement = "",
                            x = paste0("t(", LeftChr, ":", RightChr, ")"))))

paste("file.csv: ", paste(x, collapse = ", "))
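The approach above can be tried end to end on the sample rows from the question. The following is a minimal sketch, not the answerer's exact code: it assumes the data is read from inline text rather than from file.csv, and the names `pairs` and `result` are introduced here only for illustration.

```r
# Sample rows copied from the question, read from inline text
data <- read.table(text = "
Id LeftChr LeftPosition LeftStrand LeftLength RightChr
4465 chr1 33478980 + 60 chr1
4751 chr1 37908641 + 370 chr2
1690 chr1 37938262 - 112 chr5
4464 chr1 37938376 + 122 chr2
4463 chr2 59097215 + 675 chr2
", header = TRUE, as.is = TRUE)

# Build "t(left:right)" labels, strip the literal "chr", then drop duplicates
labels <- paste0("t(", data$LeftChr, ":", data$RightChr, ")")
pairs <- unique(gsub("chr", "", labels, fixed = TRUE))

# Prefix the file name and join the labels with ", "
result <- paste0("file.csv: ", paste(pairs, collapse = ", "))
print(result)
```

Using `paste0()` for the prefix gives a single space after the colon; the result line in the question shows two spaces, so adjust the prefix if the exact spacing matters.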

Answer 1: (score: 1)

dat <- read.table(text="
Id LeftChr LeftPosition    LeftStrand  LeftLength  RightChr
4465    chr1    33478980    +   60  chr1
4751    chr1    37908641    +   370 chr2
1690    chr1    37938262    -   112 chr5
4464    chr1    37938376    +   122 chr2
4463    chr2    59097215    +   675 chr2
", header = TRUE, as.is = TRUE)

library(dplyr)  # provides %>%, mutate, select, group_by

dat %>% 
  mutate(lc=gsub("chr", "", LeftChr), rc=gsub("chr", "", RightChr)) %>%
  select(lc, rc) %>%
  group_by(lc, rc) %>%
  unique
# Source: local data frame [5 x 2]
# Groups: lc, rc [4]
#
#      lc    rc
#   (chr) (chr)
# 1     1     1
# 2     1     2
# 3     1     5
# 4     2     2
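The table of distinct pairs above still has to be turned into the single line the question asks for. A minimal sketch of that last step follows; the `pairs` data frame is a hand-typed stand-in for the pipeline's output (an assumption made so the snippet runs without dplyr), and `labels` and `result` are illustrative names.

```r
# Stand-in for the distinct (lc, rc) pairs shown in the output above
pairs <- data.frame(lc = c("1", "1", "1", "2"),
                    rc = c("1", "2", "5", "2"),
                    stringsAsFactors = FALSE)

# Format each row as "t(lc:rc)"; sprintf is vectorised over the columns
labels <- sprintf("t(%s:%s)", pairs$lc, pairs$rc)

# Join the labels and prefix the file name
result <- paste0("file.csv: ", paste(labels, collapse = ", "))
print(result)
```

In practice the same two formatting lines could be applied directly to the data frame returned by the pipeline instead of the stand-in.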