Starting from the following working construct:
sum(sapply(DNAStringSet(seq_set[, 1]), function(s)
  countPWM(motifs[[1]], reverseComplement(s), min.score = "75%")))
I wrote this loop:
percentages <- as.character(seq(0, 100, 5))
for (i in 1:length(percentages)) {
  sum(sapply(DNAStringSet(seq_set[, 1]), function(s)
    countPWM(
      motifs[[1]],
      reverseComplement(s),
      min.score = as.character(cat('"', percentages[i], "%", '"', sep = ""))
    )))
}
It returns the following:
Error in .normargMinScore(min.score, pwm) :
'min.score' must be a single number or string
I do understand that there is a problem with the data type of min.score, but when I check:
test <- as.character(cat('"', percentages[1], "%" , '"', sep = ""))
typeof(test)
> typeof(test)
[1] "character"
it seems to be in order.
I thought this might be related to the type coercion described on R-bloggers, since I am using the sapply function, but that does not seem to be the cause.
Any help is much appreciated, as I am still new to R and to programming.
My sessionInfo():
R version 3.2.5 (2016-04-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils
[7] datasets methods base
other attached packages:
[1] Biostrings_2.38.4 XVector_0.10.0 IRanges_2.4.8
[4] S4Vectors_0.8.11 BiocGenerics_0.16.1
loaded via a namespace (and not attached):
[1] zlibbioc_1.16.0 tools_3.2.5
This is how I constructed my data:
seq_set <- matrix(1:2000, 1000, 2)
seq_set[, 1] <-
sapply(seq_set[, 1], function(s)
paste(sample(
c('A', 'C', 'G', 'T'),
size = ncol(motifs[[1]]),
replace = T
), collapse = ''))
seq_set[, 2] <-
sapply(seq_set[, 2], function(s)
paste(sample(
c('A', 'C', 'G', 'T'),
size = ncol(motifs[[2]]),
replace = T
), collapse = ''))
These are the packages in my library:
AnnotationDbi Annotation Database Interface
Biobase Biobase: Base functions for Bioconductor
BiocGenerics S4 generic functions for Bioconductor
BiocInstaller Install/Update Bioconductor, CRAN, and github Packages
BiocParallel Bioconductor facilities for parallel evaluation
Biostrings String objects representing biological sequences, and
matching algorithms
bitops Bitwise Operations
BSgenome Infrastructure for Biostrings-based genome data packages and
support for efficient SNP representation
caTools Tools: moving window statistics, GIF, Base64, ROC AUC, etc.
CNEr CNE Detection and Visualization
DBI R Database Interface
DirichletMultinomial Dirichlet-Multinomial Mixture Model Machine Learning for
Microbiome Data
futile.logger A Logging Utility for R
futile.options Futile options management
GenomeInfoDb Utilities for manipulating chromosome and other 'seqname'
identifiers
GenomicAlignments Representation and manipulation of short genomic alignments
GenomicRanges Representation and manipulation of genomic intervals and
variables defined along a genome
gtools Various R Programming Tools
IRanges Infrastructure for manipulating intervals on sequences
lambda.r Modeling Data with Functional Programming
Rcpp Seamless R and C++ Integration
RCurl General Network (HTTP/FTP/...) Client Interface for R
Rsamtools Binary alignment (BAM), FASTA, variant call (BCF), and tabix
file import
RSQLite SQLite Interface for R
rtracklayer R interface to genome browsers and their annotation tracks
S4Vectors S4 implementation of vectors and lists
seqLogo Sequence logos for DNA sequence alignments
snow Simple Network of Workstations
SummarizedExperiment SummarizedExperiment container
TFBSTools Software Package for Transcription Factor Binding Site
(TFBS) Analysis
TFMPvalue Efficient and Accurate P-Value Computation for Position
Weight Matrices
XML Tools for Parsing and Generating XML Within R and S-Plus
XVector Representation and manpulation of external sequences
zlibbioc An R packaged zlib-1.2.5
Packages in library ‘/usr/lib/R/library’:
base The R Base Package
boot Bootstrap Functions (Originally by Angelo Canty for S)
class Functions for Classification
cluster "Finding Groups in Data": Cluster Analysis Extended
Rousseeuw et al.
codetools Code Analysis Tools for R
compiler The R Compiler Package
datasets The R Datasets Package
foreign Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat,
Weka, dBase, ...
graphics The R Graphics Package
grDevices The R Graphics Devices and Support for Colours and Fonts
grid The Grid Graphics Package
KernSmooth Functions for Kernel Smoothing Supporting Wand & Jones
(1995)
lattice Trellis Graphics for R
MASS Support Functions and Datasets for Venables and Ripley's
MASS
Matrix Sparse and Dense Matrix Classes and Methods
methods Formal Methods and Classes
mgcv Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness
Estimation
nlme Linear and Nonlinear Mixed Effects Models
nnet Feed-Forward Neural Networks and Multinomial Log-Linear
Models
parallel Support for Parallel computation in R
rpart Recursive Partitioning and Regression Trees
spatial Functions for Kriging and Point Pattern Analysis
splines Regression Spline Functions and Classes
stats The R Stats Package
stats4 Statistical Functions using S4 Classes
survival Survival Analysis
tcltk Tcl/Tk Interface
tools Tools for Package Development
utils The R Utils Package
Answer 0: (score: 0)
Nicola's comment was useful.
This:
seq_set_matches <- matrix(1:42, 21, 2)
percentages <- as.character(seq(0, 100, 5))
for (i in 1:length(percentages)) {
  seq_set_matches[i, 1] <- sum(sapply(DNAStringSet(seq_set[, 1]), function(s)
    countPWM(
      motifs[[1]],
      reverseComplement(s),
      min.score = paste(percentages[i], "%", sep = "")
    )))
}
works. Dear Nicola, if you like, I would gladly accept your help as the official answer. Thanks again.
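For anyone wondering why the original attempt failed even though typeof() reported "character": as far as I understand it, cat() only prints to the console and returns NULL, so wrapping it in as.character() yields a character vector of length zero, not a single string. The snippet below is a minimal sketch in base R (no Biostrings needed); the variable names pct, bad, good and thresholds are just illustrative.

```r
pct <- "75"

# cat() prints "75%" (with literal quotes) to the console and returns NULL
bad <- as.character(cat('"', pct, "%", '"', sep = ""))
typeof(bad)   # still "character", which is why the check looked fine
length(bad)   # but the vector is empty, so it is not "a single string"

# paste0() actually returns the string
good <- paste0(pct, "%")
length(good)

# paste0() is vectorised, so all thresholds can be built in one step
thresholds <- paste0(seq(0, 100, 5), "%")
```

Checking length() in addition to typeof() would have exposed the problem. Each element of thresholds could then be passed as min.score inside the loop, in place of the paste() call.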