Fetching data from an iRODS server with snakemake

Time: 2018-11-12 17:42:12

Tags: snakemake

I am trying to use snakemake to download files from an iRODS server.

I have followed the instructions here.

I am working on a GNU/Linux compute cluster.

I have put this at the top of my Snakefile:

from snakemake.remote.iRODS import RemoteProvider

irods = RemoteProvider(irods_env_file='/nfs/users/nfs_c/username/.irods/irods_environment.json',
                       timezone="Europe/London") # all parameters are optional

This code runs in a Python interpreter without raising an error.

Then I have this rule:

rule iget:
    input:
        irodsBam = irods.remote('/seq/path/to/my/file.bam')
    output:
        lustreBam = 'new/file/location/on/lustre.bam'
    conda:
        'envs/iget.yaml'
    shell:
        'iget -K -f {input.irodsBam} {output.lustreBam}'

In the yaml file, I have:

channels:
  - bioconda
  - r
dependencies:
  - boto
  - moto
  - filechunkio
  - pysftp
  - dropbox
  - requests
  - ftputil
  - XRootD
  - biopython

Then I ran snakemake with the following command:

snakemake --use-conda new/file/location/on/lustre.bam

I got this error:

Building DAG of jobs...
MissingInputException in line 146 of /lustre/projects/main/Snakefile:
Missing input files for rule igetPacbio:
/seq/path/to/my/file.bam

However, I checked with ils, and my BAM file is at the specified location on iRODS.

Can anyone help me, please?

Update 13 Nov 2018

I also tried the following rule:

rule iget:
    input:
        irodsBam = irods.remote('/seq/path/to/my/file.bam')
    output:
        'new/file/location/on/lustre.bam'
    conda:
        'envs/iget.yaml'
    shell:
        r"""
        touch {output}
        """

But the error is the same:

MissingInputException

Update 24 Nov 2018

ils /seq/pacbio/r54097_20170511_114349/1_A01/

lists the collection without error, and removing the leading '/' from '/seq/path/to/my/file.bam' does not change the error. It is still MissingInputException.
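
One thing that can be ruled out independently of snakemake is the irods_env_file passed to RemoteProvider. The sketch below is only a sanity check, not a fix: it uses the Python standard library alone to confirm that the file exists, parses as JSON and carries the usual connection keys. The helper name check_irods_env and the REQUIRED_KEYS tuple are hypothetical names introduced here, and the key list is an assumed minimal set that a given site may extend.

```python
import json
from pathlib import Path

# Keys an iRODS client environment file normally carries.
# Assumption: this is a minimal set; a site may require more.
REQUIRED_KEYS = ("irods_host", "irods_port", "irods_user_name", "irods_zone_name")


def check_irods_env(env_file):
    """Return a list of problems found in an iRODS environment file.

    An empty list means the file exists, parses as a JSON object and
    contains every key in REQUIRED_KEYS.
    """
    path = Path(env_file).expanduser()
    if not path.is_file():
        return [f"{path} does not exist or is not a regular file"]
    try:
        settings = json.loads(path.read_text())
    except json.JSONDecodeError as err:
        return [f"{path} is not valid JSON: {err}"]
    if not isinstance(settings, dict):
        return [f"{path} does not contain a JSON object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in settings]


if __name__ == "__main__":
    # Path taken from the RemoteProvider call in the Snakefile.
    env_file = "/nfs/users/nfs_c/username/.irods/irods_environment.json"
    for problem in check_irods_env(env_file):
        print(problem)
```

If this prints nothing, the environment file itself is probably not the cause, and the problem is more likely in how the remote path is resolved when the DAG is built.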

0 answers:

No answers