Index values don't match in R

Time: 2015-06-12 14:47:01

Tags: r indexing

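Before making any changes, I summarized the table. Then I extracted the data, excluding the rows with empty values and the rows with "I or II NOS", and assigned the results to a1 and a2 respectively. a1 contains the correct data, but a2 still shows 4 "I or II NOS" entries. When I index the original table for "I or II NOS", it returns 10 rows, yet 4 of those rows do not have the value "I or II NOS". How can this happen? Can anyone help? I don't have enough reputation to paste a screenshot of the results, so I am only pasting the code. Thanks in advance.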

a1 = a[AJCC_PATHOLOGIC_TUMOR_STAGE!='',]

a2 = a1[AJCC_PATHOLOGIC_TUMOR_STAGE!='I or II NOS',]

Sorry, I have updated the question and pasted the whole code.

library("cgdsr", lib.loc="~/R/win-library/3.1")
library("R.oo", lib.loc="~/R/win-library/3.1")
library("R.methodsS3", lib.loc="~/R/win-library/3.1")
# Create CGDS object
mycgds = CGDS("http://www.cbioportal.org/public-portal/")
test(mycgds)
# Get list of cancer studies at server
getCancerStudies(mycgds)[, c(1,2)]

mycancerstudy = getCancerStudies(mycgds)[78,1]
# Get available case lists (collection of samples) for a given cancer study
getCaseLists(mycgds,mycancerstudy)[,1]

mycaselist = getCaseLists(mycgds,mycancerstudy)[2,1]

# Get available genetic profiles
getGeneticProfiles(mycgds,mycancerstudy)[,1]

mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[2,1]

# Get clinical data for the case list
myclinicaldata = getClinicalData(mycgds,mycaselist)

# skcm_tcga_rna_seq_v2_mrna_median_Zscores
z_score_caselist = getCaseLists(mycgds,mycancerstudy)[7,1]

# Get data slices for a specified list of genes, genetic profile and case list
WNT5A = getProfileData(mycgds,c('WNT5A'),mygeneticprofile,mycaselist)

# documentation
help('cgdsr')
help('CGDS')

WNT5A_stage = merge(WNT5A,myclinicaldata, by = 'row.names')
WNT5A_stage_table = WNT5A_stage[, c(2, 6)]
a = na.omit(WNT5A_stage_table)
a1 =  a[a$AJCC_PATHOLOGIC_TUMOR_STAGE!='']
a2 = a1[AJCC_PATHOLOGIC_TUMOR_STAGE!='I or II NOS',]

I have just added part of the result, shown below. You can see that the returned values do not match the index condition.

>a1[AJCC_PATHOLOGIC_TUMOR_STAGE=='I or II NOS',]
        WNT5A         AJCC_PATHOLOGIC_TUMOR_STAGE
  8     712.1645                 I or II NOS
 28      7.5434                 I or II NOS
 33      3.6290                 I or II NOS
 34      8.7881                 I or II NOS
 38    150.3167                 I or II NOS
 47     34.3643                 I or II NOS
 180    19.1529                    Stage IB
 304    20.1072                   Stage IIC
 324    44.0167                    Stage IB
 337 19142.6676                  Stage IIIC

1 answer:

Answer 0 (score: 0)

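As noted in my comment, you are not subsetting with the column from the new data frame. The bare name AJCC_PATHOLOGIC_TUMOR_STAGE is looked up outside a1 (presumably a vector left in the workspace or coming from an attached data frame), so the logical index is built from a vector whose positions no longer line up with the rows of a1. You need: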

a2 = a1[a1$AJCC_PATHOLOGIC_TUMOR_STAGE!='I or II NOS',]

or, equivalently:

a2 = subset(a1, AJCC_PATHOLOGIC_TUMOR_STAGE != 'I or II NOS')
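To see why the misaligned index returns rows with the wrong stage, here is a minimal sketch on made-up data. The toy values and the free-standing vector are assumptions for illustration only; the question does not show where the bare AJCC_PATHOLOGIC_TUMOR_STAGE actually comes from.

```r
# Toy data standing in for the merged table (all values are made up).
a <- data.frame(
  WNT5A = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5),
  AJCC_PATHOLOGIC_TUMOR_STAGE = c("I or II NOS", "", "Stage IB",
                                  "I or II NOS", "Stage IIC", "Stage IB"),
  stringsAsFactors = FALSE
)

# Assumption: a free-standing vector with the same name as the column,
# e.g. left over from attach() or an earlier assignment.
# It has one entry per row of `a` (6 entries).
AJCC_PATHOLOGIC_TUMOR_STAGE <- a$AJCC_PATHOLOGIC_TUMOR_STAGE

# First filter: drop the empty stage. a1 now has 5 rows, not 6.
a1 <- a[a$AJCC_PATHOLOGIC_TUMOR_STAGE != "", ]

# Misaligned: the mask comes from the 6-element free-standing vector,
# so position i of the mask no longer corresponds to row i of a1.
wrong <- a1[AJCC_PATHOLOGIC_TUMOR_STAGE == "I or II NOS", ]

# Aligned: the mask is built from a1's own column.
right <- a1[a1$AJCC_PATHOLOGIC_TUMOR_STAGE == "I or II NOS", ]

print(wrong)  # includes a "Stage IIC" row that should not match
print(right)  # only "I or II NOS" rows
```

Rows before the first removed row still line up, which would explain why the first few results in the question are correct and only the later ones are wrong.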