I have a large data set with the following column names:
"chromosome" "start" "end" "h.gene" "CPCN_LUNG" "NCIH524_LUNG" "SBC5_LUNG" "NCIH446_LUNG" "NCIH196_LUNG"
"NCIH209_LUNG" "NCIH1963_LUNG" "NCIH211_LUNG" "NCIH2196_LUNG" "NCIH526_LUNG" "NCIH82_LUNG" "SW1271_LUNG" "DMS114_LUNG" "NCIH2029_LUNG" "NCIH2066_LUNG" "NCIH1341_LUNG"
"NCIH2227_LUNG" "NCIH69_LUNG" "NCIH1048_LUNG" "DMS53_LUNG" "SHP77_LUNG" "NCIH1836_LUNG" "NCIH2141_LUNG" "COLO668_LUNG" "NCIH1105_LUNG" "NCIH1876_LUNG" "NCIH841_LUNG"
"DMS273_LUNG" "CORL279_LUNG" "NCIH1092_LUNG" "CORL95_LUNG" "CORL88_LUNG" "NCIH1694_LUNG" "NCIH1436_LUNG"
I want to create a GRanges object from this data set.
reference_GRange <- GRanges(seqnames= reference$chromosome,IRanges(start= reference$start,end= reference$end),h.gene=reference$h.gene)
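This creates a GRanges object with only 2 metadata columns. Is there a way to create the GRanges object with all the information from the reference table, i.e. with metadata columns from h.gene, CPCN_LUNG, NCIH524_LUNG, ......... through NCIH1436_LUNG?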
Answer 0 (score: 5)
Use makeGRangesFromDataFrame() with keep.extra.columns=TRUE. Alternatively, create the GRanges as above and then set its mcols(), dropping the columns that are not of interest:
mcols(gr) = reference[,-(1:3)]
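Both suggestions can be tried on a small example. The sketch below assumes a toy stand-in for `reference` with made-up values and only two of the cell-line columns. It also assumes the coordinate columns are named `chromosome`, `start` and `end`, which makeGRangesFromDataFrame() detects automatically (if yours are named differently, the `seqnames.field`, `start.field` and `end.field` arguments can be set explicitly).

```r
library(GenomicRanges)

# Hypothetical toy version of `reference`; the values are invented for
# illustration and only two of the *_LUNG columns are included.
reference <- data.frame(
  chromosome   = c("chr1", "chr1", "chr2"),
  start        = c(100L, 500L, 200L),
  end          = c(200L, 650L, 300L),
  h.gene       = c("GENE_A", "GENE_B", "GENE_C"),
  CPCN_LUNG    = c(0.1, -0.4, 0.0),
  NCIH524_LUNG = c(0.3, 0.2, -0.1),
  stringsAsFactors = FALSE
)

# Approach 1: build the GRanges straight from the data frame and keep
# every non-coordinate column as a metadata column.
gr1 <- makeGRangesFromDataFrame(reference, keep.extra.columns = TRUE)

# Approach 2: build the ranges first, then attach all columns except
# the first three (chromosome, start, end) as metadata.
gr2 <- GRanges(
  seqnames = reference$chromosome,
  ranges   = IRanges(start = reference$start, end = reference$end)
)
mcols(gr2) <- reference[, -(1:3)]

# Inspect which metadata columns each object carries
colnames(mcols(gr1))
colnames(mcols(gr2))
```

Either way, the metadata columns are taken from the table itself, so nothing has to be listed by hand. Note that the second approach relies on the coordinate columns being the first three columns of the table.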
Feel free to ask questions about Bioconductor packages on the Bioconductor support forum.
Answer 1 (score: 0)
reference_GRange <- GRanges(seqnames = reference$chromosome, IRanges(start = reference$start, end = reference$end), h.gene = reference$h.gene, CPCN_LUNG = reference$CPCN_LUNG, NCIH524_LUNG = reference$NCIH524_LUNG, ....., NCIH1436_LUNG = reference$NCIH1436_LUNG)
But adding every extra column to the GRanges object by hand like this can be a real hassle!