Question

I found this program in A Little Book of R for Bioinformatics. Link: https://a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/src/chapter7.html

#finds start and stop codons in DNA sequence
#from Avril Coghlan, Little Book of R for Bioinformatics
library(Biostrings)
findPotentialStartsAndStops <- function(sequence)
{
  # Define a vector with the sequences of potential start and stop codons
  codons <- c("ATG", "TAA", "TAG", "TGA")
  # Find the number of occurrences of each type of potential start or stop codon
  for (i in 1:4)
  {
    codon <- codons[i]
    # Find all occurrences of codon "codon" in sequence "sequence"
    occurrences <- matchPattern(codon, sequence)
    # Find the start positions of all occurrences of "codon" in sequence "sequence"
    codonpositions <- attr(occurrences,"start")
    # Find the total number of potential start and stop codons in sequence "sequence"
    numoccurrences <- length(codonpositions)
    if (i == 1)
    {
      # Make a copy of vector "codonpositions" called "positions"
      positions <- codonpositions
      # Make a vector "types" containing "numoccurrences" copies of "codon"
      types <- rep(codon, numoccurrences)
    }
    else
    {
      # Add the vector "codonpositions" to the end of vector "positions":
      positions <- append(positions, codonpositions, after=length(positions))
      # Add the vector "rep(codon, numoccurrences)" to the end of vector "types":
      types <- append(types, rep(codon, numoccurrences), after=length(types))
    }
  }
  # Sort the vectors "positions" and "types" in order of position along the input sequence:
  indices <- order(positions)
  positions <- positions[indices]
  types <- types[indices]
  # Return a list variable including vectors "positions" and "types":
  mylist <- list(positions,types)
  return(mylist)
}

s1 <- "ACGGTATGTAATGTGA"
#tried as vector also s1 <- c("A", "C", "G", "G", "T", "A", "T", "G", "T", "A", "A", "T", "G", "T", "G", "A")

findPotentialStartsAndStops(s1)

If I pass the DNA sequence as a single string, I get this error:

    Error in .Method(..., na.last = na.last, decreasing = decreasing) : 
      argument 1 is not a vector
    7 .Method(..., na.last = na.last, decreasing = decreasing) 
    6 eval(expr, envir, enclos) 
    5 eval(.dotsCall, env) 
    4 eval(.dotsCall, env) 
    3 standardGeneric("order") 
    2 order(positions) 
    1 findPotentialStartsAndStops(s1) 
    Called from: (function () 
    {
        .rs.breakOnError(TRUE)
    })()

If I pass the DNA sequence as a vector, I get this error:

    Error in .Call2("new_XString_from_CHARACTER", classname, x, start(solved_SEW),  :   zero or more than one input sequence 
    8 .Call2("new_XString_from_CHARACTER", classname, x, start(solved_SEW), 
    width(solved_SEW), get_seqtype_conversion_lookup("B", seqtype), 
    PACKAGE = "Biostrings") 
    7 .charToXString(seqtype, x, start, end, width) 
    6 XString(NULL, subject) 
    5 XString(NULL, subject) 
    4 .XString.matchPattern(pattern, subject, max.mismatch, min.mismatch, 
    with.indels, fixed, algorithm) 
    3 matchPattern(codon, sequence) 
    2 matchPattern(codon, sequence) 
    1 findPotentialStartsAndStops(s1)

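From the code, the program seems to expect the DNA sequence to be a character string.

So the problem may be in occurrences <- matchPattern(codon, sequence): is it something about whether the input is, or should be, a vector? But codon is already a vector, and class(codon) confirms it. I don't understand what is wrong.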

Answer 1

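The code appears to be outdated. The current version of Biostrings (2.38.2) probably returns a different kind of object than it did when the book was written. This line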

codonpositions <- attr(occurrences,"start")

should be replaced with

codonpositions <- start(occurrences)
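
For context, here is a minimal sketch of the whole function with that one-line change applied. It assumes Biostrings is installed and that start() works on the object returned by matchPattern(), as suggested above. The loop is also condensed and the returned list elements are named; those changes are my own simplification, not the book's code.

```r
library(Biostrings)

# Sketch of the book's function using start() instead of attr(..., "start").
findPotentialStartsAndStops <- function(sequence) {
  codons <- c("ATG", "TAA", "TAG", "TGA")
  positions <- integer(0)
  types <- character(0)
  for (codon in codons) {
    # Find all occurrences of this codon in the sequence
    occurrences <- matchPattern(codon, sequence)
    # Use the accessor rather than reading an attribute directly
    codonpositions <- start(occurrences)
    positions <- c(positions, codonpositions)
    types <- c(types, rep(codon, length(codonpositions)))
  }
  # Sort both vectors by position along the input sequence
  indices <- order(positions)
  list(positions = positions[indices], types = types[indices])
}

result <- findPotentialStartsAndStops("ACGGTATGTAATGTGA")
print(result)
```

For the test sequence in the question this should report ATG at positions 6 and 11, TAA at 9, and TGA at 14.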

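The error "argument 1 is not a vector" presumably arises because attr(occurrences, "start") now returns NULL, so positions is still NULL when it is passed to order(positions).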
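
The "argument 1 is not a vector" failure can be reproduced in base R without Biostrings. This is only an illustration: the plain numeric vector x is a hypothetical stand-in for the match result, chosen because it has no "start" attribute.

```r
# Hypothetical stand-in for an object that has no "start" attribute
x <- c(6, 11)

# attr() returns NULL when the requested attribute does not exist
codonpositions <- attr(x, "start")
print(is.null(codonpositions))

# Appending NULL to NULL leaves NULL, as happens on every loop iteration
positions <- NULL
positions <- append(positions, codonpositions, after = length(positions))

# order() rejects NULL; capture the message instead of stopping the script
msg <- tryCatch(
  {
    order(positions)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Reading the positions through an accessor such as start() avoids depending on how the object stores them internally.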
