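I have a large amount of data in mzML format. With the latest version of R (v3.3.2) and the latest RStudio daily build (v1.1.47), reading an mzML file crashes R when it runs inside RStudio, but not when R runs in a terminal.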
library(mzR)
library(msdata)
mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
package = "msdata")
aa <- openMSfile(mzxml) # this works
mzml <- system.file("microtofq/MM8.mzML", package = "msdata")
bb <- openMSfile(mzml) # this crashes R, but only in RStudio
sessionInfo()
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] msdata_0.14.0 mzR_2.8.1 Rcpp_0.12.9
loaded via a namespace (and not attached):
[1] ProtGenerics_1.6.0 parallel_3.3.2 Biobase_2.34.0
[4] codetools_0.2-15 BiocGenerics_0.20.0
Update

Attaching lldb and running (by the way, make sure you run it as root!) gives the following stack trace:
error: mzR.so 0x010c47ab: DW_TAG_member '_M_local_buf' refers to type 0x0110cd75 which extends beyond the bounds of 0x010c47a3
error: mzR.so 0x00efe9cd: DW_TAG_member '_M_local_buf' refers to type 0x00f369c9 which extends beyond the bounds of 0x00efe9c5
error: mzR.so 0x000000cc: DW_TAG_member '_M_local_buf' refers to type 0x0000a52f which extends beyond the bounds of 0x000000c4
* thread #1: tid = 3799, 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189, name = 'rsession', stop reason = signal SIGSEGV: invalid address (fault address: 0x83e6de7)
* frame #0: 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189
frame #1: 0x00007f2b01ed4aaf mzR.so`pwiz::msdata::IO::HandlerMSData::startElement(this=0x00007ffcdab7b9f0, name=<unavailable>, attributes=<unavailable>, position=<unavailable>) + 511 at IO.cpp:2666
frame #2: 0x00007f2b01fd1593 mzR.so`pwiz::minimxml::SAXParser::(anonymous namespace)::HandlerWrangler::startElement(this=0x00007ffcdab7b5b0, name="mzML", attributes=0x00007ffcdab7b558, position=45) const + 147 at SAXParser.cpp:211
frame #3: 0x00007f2b01fd2cfa mzR.so`pwiz::minimxml::SAXParser::parse(is=0x00000000066528c0, handler=0x00007ffcdab7b9f0) + 2810 at SAXParser.cpp:531
frame #4: 0x00007f2b01ec1927 mzR.so`pwiz::msdata::IO::read(is=0x00000000066528c0, msd=0x0000000004acc510, spectrumListFlag=IgnoreSpectrumList) + 3671 at IO.cpp:2766
frame #5: 0x00007f2b01e5747b mzR.so`pwiz::msdata::Serializer_mzML::Impl::read(this=0x000000000501ff30, is=shared_ptr<std::basic_istream<char, std::char_traits<char> > > @ 0x00007ffcdab7c8e0, msd=0x0000000004acc510) const + 107 at Serializer_mzML.cpp:223
frame #6: 0x00007f2b01e57b2a mzR.so`pwiz::msdata::Serializer_mzML::read(this=<unavailable>, is=<unavailable>, msd=<unavailable>) const + 58 at Serializer_mzML.cpp:250
frame #7: 0x00007f2b01e43964 mzR.so`pwiz::msdata::Reader_mzML::read(this=<unavailable>, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head=<unavailable>, result=0x0000000004acc510, runIndex=<unavailable>, config=<unavailable>) const + 948 at DefaultReaderList.cpp:148
frame #8: 0x00007f2b01e5d855 mzR.so`pwiz::msdata::ReaderList::read(this=0x00000000058e1170, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=", result=0x0000000004acc510, sampleIndex=0, config=0x00007ffcdab7cb10) const + 181 at Reader.cpp:101
frame #9: 0x00007f2b01eea60f mzR.so`pwiz::msdata::(anonymous namespace)::(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", msd=0x0000000004acc510, reader=0x00000000058e1170, head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=")(const string &const, pwiz::msdata::MSData &const, const pwiz::msdata::Reader &const, const string &const) + 127 at MSDataFile.cpp:61
frame #10: 0x00007f2b01eec0ba mzR.so`pwiz::msdata::MSDataFile::MSDataFile(this=0x0000000004acc510, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", reader=<unavailable>, calculateSourceFileChecksum=<unavailable>) + 218 at MSDataFile.cpp:91
frame #11: 0x00007f2b01ee434a mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [inlined] pwiz::msdata::RAMPAdapter::Impl::Impl(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", this=0x0000000004acc510) + 5 at RAMPAdapter.cpp:49
frame #12: 0x00007f2b01ee4345 mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(this=0x00000000050b8110, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 37 at RAMPAdapter.cpp:296
frame #13: 0x00007f2b01d3df7e mzR.so`rampOpenFile(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 542 at ramp.cpp:284
frame #14: 0x00007f2b01d3cd15 mzR.so`cRamp::cRamp(this=0x00000000050e4ce0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 149 at cramp.cpp:75
frame #15: 0x00007f2b01d45af8 mzR.so`RcppRamp::open(this=0x0000000004fd0ec0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 72 at RcppRamp.cpp:23
frame #16: 0x00007f2b01d5a955 mzR.so`Rcpp::CppMethod2<RcppRamp, void, char const*, bool>::operator(this=0x00000000050c9990, object=0x0000000004fd0ec0, args=<unavailable>)(RcppRamp*, SEXPREC**) + 245 at Module_generated_CppMethod.h:215
frame #17: 0x00007f2b01d570f0 mzR.so`Rcpp::class_<RcppRamp>::invoke_void(this=<unavailable>, method_xp=<unavailable>, object=0x0000000005cbe020, args=0x00007ffcdab7d2f0, nargs=<unavailable>) + 176 at class.h:212
frame #18: 0x00007f2b027e3f41 Rcpp.so`CppMethod__invoke_void(args=<unavailable>) + 449 at Module.cpp:200
frame #19: 0x00007f2b26a6e9b1 libR.so`do_External(call=0x000000000570f208, op=0x000000000333ac20, args=0x0000000005dd50b8, env=0x0000000005dd50f0) + 337 at dotcode.c:548
frame #20: 0x00007f2b26aa86df libR.so`Rf_eval(e=0x000000000570f208, rho=0x0000000005dd50f0) + 1871 at eval.c:713
frame #21: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x000000000570f198, op=0x00000000033254d8, args=0x000000000570f358, rho=0x0000000005dd50f0) + 344 at eval.c:1807
frame #22: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x0000000005dd50f0) + 1345 at eval.c:685
frame #23: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
frame #24: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x00000000062d0ad8, rho=0x000000000595b3e0) + 797 at eval.c:732
frame #25: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d0d08, op=0x00000000033254d8, args=0x00000000062d0b10, rho=0x000000000595b3e0) + 344 at eval.c:1807
frame #26: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
frame #27: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
frame #28: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d04a0, op=0x00000000033254d8, args=0x00000000062d0e90, rho=0x000000000595b3e0) + 344 at eval.c:1807
frame #29: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
frame #30: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
frame #31: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x000000000595a9f0, rho=0x000000000334ff58) + 797 at eval.c:732
frame #32: 0x00007f2b26aabf66 libR.so`do_set(call=0x000000000595b680, op=0x000000000330c2d8, args=<unavailable>, rho=0x000000000334ff58) + 166 at eval.c:2197
frame #33: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000334ff58) + 1345 at eval.c:685
frame #34: 0x00007f2b26acf932 libR.so`Rf_ReplIteration(rho=0x000000000334ff58, savestack=<unavailable>, browselevel=<unavailable>, state=0x00007ffcdab7f020) + 546 at main.c:258
frame #35: 0x00007f2b26acfca1 libR.so`R_ReplConsole(rho=0x000000000334ff58, savestack=0, browselevel=0) + 129 at main.c:308
frame #36: 0x00007f2b26acfd58 libR.so`run_Rmainloop + 72 at main.c:1059
frame #37: 0x0000000000d85dc2 rsession`rstudio::r::session::runEmbeddedR(rstudio::core::FilePath const&, rstudio::core::FilePath const&, bool, bool, SA_TYPE, rstudio::r::session::Callbacks const&, rstudio::r::session::InternalCallbacks*) + 434
frame #38: 0x0000000000d6791d rsession`rstudio::r::session::run(rstudio::r::session::ROptions const&, rstudio::r::session::RCallbacks const&) + 9581
frame #39: 0x00000000006a7299 rsession`main + 10809
frame #40: 0x00007f2b2530e830 libc.so.6`__libc_start_main(main=(rsession`main), argc=11, argv=0x00007ffcdab81fd8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffcdab81fc8) + 240 at libc-start.c:291
frame #41: 0x00000000006b4fe1 rsession`_start + 41
This definitely looks like a memory problem on the ProteoWizard side that somehow goes unnoticed in a plain R process.
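Since R started from a terminal does not crash, one possible stopgap is to do the read in a fresh Rscript process and pass back only plain R data. The following is a minimal sketch, not a fix: `run_in_fresh_r` is a hypothetical helper name, and it assumes the child process can find the same package library as the parent session. The open file handle itself presumably cannot be passed between processes, only ordinary R objects such as the data frame returned by `header()`.

```r
# Hypothetical helper: evaluate R code in a fresh Rscript process and
# return its value through a temporary RDS file.
run_in_fresh_r <- function(expr_text) {
  out <- tempfile(fileext = ".rds")
  script <- tempfile(fileext = ".R")
  on.exit(unlink(c(out, script)), add = TRUE)

  # The child script evaluates the code and serializes the result.
  writeLines(c(
    sprintf("result <- local({\n%s\n})", expr_text),
    sprintf("saveRDS(result, %s)", deparse(out))
  ), script)

  # Use the Rscript that belongs to the running R installation.
  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, shQuote(script))
  if (status != 0 || !file.exists(out)) {
    stop("child R process failed (exit status ", status, ")")
  }
  readRDS(out)
}

# Usage sketch (assumes mzR and msdata are installed for the child process):
# hdr <- run_in_fresh_r('
#   library(mzR)
#   f <- system.file("microtofq/MM8.mzML", package = "msdata")
#   ms <- openMSfile(f)
#   header(ms)
# ')
```

If the child process crashes, the helper raises an ordinary R error instead of taking the RStudio session down with it.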
Answer 0 (score: 3)
When running with a build of R compiled with sanitizers, I see:
> source("msdata.R", echo = TRUE)
> library(mzR)
Loading required package: Rcpp
> library(msdata)
> mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
+ package = "msdata")
> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in
> mzml <- system.file("microtofq/MM8.mzML", package = "msdata")
> bb <- openMSfile(mzml) # this crashes R, but only in RStudio
This part in particular:
> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in
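implies that something may be wrong in the openMSfile() function: it appears to be trying to read data at an invalid offset. I would file this with the mzR maintainers.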
> sessionInfo()
R Under development (unstable) (2017-01-17 r72002)
Platform: x86_64-apple-darwin16.3.0 (64-bit)
Running under: macOS Sierra 10.12.2
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] msdata_0.15.0 mzR_2.9.3 Rcpp_0.12.9
[4] testthat_1.0.2 rmarkdown_1.3 knitr_1.15.1
[7] roxygen2_5.0.1 devtools_1.12.0.9000
loaded via a namespace (and not attached):
[1] magrittr_1.5 BiocGenerics_0.21.3 pkgload_0.0.0.9000
[4] R6_2.2.0 stringr_1.1.0 tools_3.4.0
[7] pkgbuild_0.0.0.9000 parallel_3.4.0 Biobase_2.35.0
[10] withr_1.0.2 htmltools_0.3.5 ProtGenerics_1.7.0
[13] rprojroot_1.2 digest_0.6.11 crayon_1.3.2
[16] codetools_0.2-15 memoise_1.0.0 evaluate_0.10
[19] stringi_1.1.2 compiler_3.4.0 backports_1.0.5
EDIT: This is the configure call used to build R; all packages were built with the same compiler and flags. (Taken from R.home("etc/Makeconf").)
# R was configured using the following call
# (not including env. vars and site configuration)
# configure '--with-blas=-L/usr/local/opt/openblas/lib -lopenblas' '--with-lapack=-L/usr/local/opt/lapack/lib -llapack' '--with-cairo' '--disable-R-framework' '--enable-R-shlib' '--with-readline' '--enable-R-profiling' '--enable-memory-profiling' '--with-valgrind-instrumentation=2' '--without-internal-tzcode' '--prefix=/Users/kevin/r/r-devel-sanitizers' 'PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig'
CC = clang-3.9 -std=gnu99 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
CXX = clang++-3.9 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
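EDIT V2: Given the lldb stack trace, it looks like the culprit may indeed be a conflict between Boost versions (the bundled version used by RStudio vs. the version used by mzR). Note that mzR is now inadvertently calling a Boost routine in the rsession executable (frame #0, boost::filesystem::path::filename()), when it presumably intended to call its own bundled version.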
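A rough way to check this hypothesis from within R is to look at which Boost.Filesystem symbols the executable hosting the session exports. This is only a sketch under assumptions: it targets Linux (it relies on `/proc/self/exe`), it needs `nm` from binutils on the PATH, and the helper names `boost_fs_symbols` and `defined_dynamic_symbols` are made up for illustration.

```r
# Hypothetical helper: keep only nm output lines that mention Boost.Filesystem.
# Itanium-mangled C++ names encode that namespace as "5boost10filesystem".
boost_fs_symbols <- function(nm_lines) {
  grep("5boost10filesystem", nm_lines, fixed = TRUE, value = TRUE)
}

# Hypothetical helper: list the dynamic symbols a binary defines, using `nm`.
# Returns character(0) if `nm` or the file is not available.
defined_dynamic_symbols <- function(path) {
  if (!nzchar(Sys.which("nm")) || !isTRUE(file.exists(path))) {
    return(character(0))
  }
  suppressWarnings(
    system2("nm", c("-D", "--defined-only", shQuote(path)),
            stdout = TRUE, stderr = FALSE)
  )
}

# Executable hosting this R session: rsession inside RStudio,
# R's own binary when started from a terminal (Linux only).
host_exe <- Sys.readlink("/proc/self/exe")

hits <- boost_fs_symbols(defined_dynamic_symbols(host_exe))
cat(length(hits), "Boost.Filesystem symbols exported by", host_exe, "\n")
```

If the count is non-zero inside RStudio and zero in a terminal session, that would be consistent with mzR.so resolving `boost::filesystem` calls against rsession rather than its own copy, although it would not prove it.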