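I recently started learning R out of necessity, and so far I like it, though I am still at an early stage. I am now facing a significant and urgent challenge in R and would greatly appreciate some help. My programming skills are clearly very amateur, so I will gladly take any help I can get. Here it is: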
At the moment I can only get as far as step 2. I think step 3 is the crux of the challenge...
Many thanks in anticipation.
The script so far:
library(GEOquery)
gdslist = c('GDS3715','GDS3716','GDS3717') # ...up to perhaps 100 datasets
analysisfunc = function(gdsid) {
  gdsdat = getGEO(gdsid,destdir=".")
  gdseset = GDS2eSet(gdsdat)
  pData(gdseset)$disease.state # Needed assignment, etc... Step 3 stuff; siggenes/SAM can perhaps be done here
  return(sprintf("Results from %s should be here",gdsid))
}
resultlist = sapply(gdslist,analysisfunc) # loop function
Answer 0 (score: 0)
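This works for all GDS datasets. The function downloads a dataset, builds one class label per sample from its phenotype annotations, runs SAM, and saves the result to a file.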
GEOSAM.analysis <- function( gdsid, destdir = getwd() ) {
  ## library() fails loudly if a package is missing
  library( 'GEOquery' )
  library( 'siggenes' )
  ## check that gdsid looks like a GDS accession
  if( length( grep( 'GDS', gdsid ) ) == 0 ){
    stop( "'", gdsid, "' does not look like a GDS accession" )
  }
  ## download the dataset (cached in destdir) and convert it to an ExpressionSet
  gdsdat = getGEO( gdsid, destdir = destdir )
  gdseset = GDS2eSet( gdsdat )
  ## use every phenotype column except 'sample' and 'description' as a grouping factor
  gdseset.pData <- pData( gdseset )
  gds.factors <- setdiff( names( gdseset.pData ), c( 'sample', 'description' ) )
  ## one combined class label per sample: the factor values joined by '-'
  cl.list <- sapply( gdseset.pData[gds.factors], as.character )
  cl.list <- factor( apply( cl.list, 1, function(x){ paste( x , collapse = '-' )} ) )
  ## recode the classes: 0/1 for two classes, 1..k otherwise
  if( nlevels( cl.list ) == 2 ){
    levels( cl.list ) <- 0:1
  } else {
    levels( cl.list ) <- seq_len( nlevels( cl.list ) )
  }
  sam.gds <- sam( gdseset, cl.list )
  ## save the SAM object next to the downloaded data
  results.file <- file.path( destdir, paste( gdsid, '.sam.gds.rdata', sep ='' ) )
  save( sam.gds, file = results.file )
  return( sprintf( "Results from %s are saved in '%s'. Load them with load('%s').", gdsid, results.file, results.file ) )
}
gdslist = c('GDS3715', 'GDS3716', 'GDS3717')
resultlist = sapply(gdslist, GEOSAM.analysis)
print(resultlist)
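The class-label step in the function above can be tried without downloading anything. The following is only a minimal sketch of that step in base R: the data frame `pheno` is a made-up stand-in for `pData(gdseset)`, and `make_class_labels` is a hypothetical helper name, not part of GEOquery or siggenes. It assumes, as the answer's code does, that two classes should be coded 0/1 and more than two classes 1..k.

```r
# Toy stand-in for pData(gdseset); the column values are invented.
pheno <- data.frame(
  sample        = c("GSM1", "GSM2", "GSM3", "GSM4"),
  disease.state = c("control", "control", "diabetic", "diabetic"),
  description   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse all annotation columns (except the ones
# listed in `drop`) into a single integer class vector.
make_class_labels <- function(pheno, drop = c("sample", "description")) {
  factor.cols <- setdiff(names(pheno), drop)
  if (length(factor.cols) == 0) {
    stop("no annotation columns left to build classes from")
  }
  # One combined label per sample: the column values joined by "-"
  cols <- unname(as.list(pheno[factor.cols]))
  combined <- factor(do.call(paste, c(cols, sep = "-")))
  codes <- as.integer(combined)      # 1..k, following the sorted level order
  if (nlevels(combined) == 2) {
    codes <- codes - 1L              # 0/1 when there are exactly two classes
  }
  codes
}

cl <- make_class_labels(pheno)
print(cl)
```

Note that the codes follow the sorted order of the combined labels, so which group ends up as 0 and which as 1 depends on the label text rather than on which group is the control. It is worth checking that mapping for each dataset before interpreting the direction of any SAM result.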