RStudio crashes when reading mzML files, but R in a terminal does not

Asked: 2017-01-19 17:04:03

Tags: r rstudio

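I have a large amount of data in the mzML file format. With the latest release of R (v3.3.2) and the latest RStudio daily build (v1.1.47), reading an mzML file crashes the R session inside RStudio, but the same code does not crash R run from a terminal.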

library(mzR)
library(msdata)

mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
                     package = "msdata")
aa <- openMSfile(mzxml) # this works

mzml <- system.file("microtofq/MM8.mzML", package = "msdata")

bb <- openMSfile(mzml) # this crashes R, but only in RStudio

sessionInfo()

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] msdata_0.14.0 mzR_2.8.1     Rcpp_0.12.9  

loaded via a namespace (and not attached):
[1] ProtGenerics_1.6.0  parallel_3.3.2      Biobase_2.34.0     
[4] codetools_0.2-15    BiocGenerics_0.20.0

Update

Running with lldb attached to the session (by the way, make sure to run lldb as root!) gives the following stack trace:

error: mzR.so 0x010c47ab: DW_TAG_member '_M_local_buf' refers to type 0x0110cd75 which extends beyond the bounds of 0x010c47a3
error: mzR.so 0x00efe9cd: DW_TAG_member '_M_local_buf' refers to type 0x00f369c9 which extends beyond the bounds of 0x00efe9c5
error: mzR.so 0x000000cc: DW_TAG_member '_M_local_buf' refers to type 0x0000a52f which extends beyond the bounds of 0x000000c4
* thread #1: tid = 3799, 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189, name = 'rsession', stop reason = signal SIGSEGV: invalid address (fault address: 0x83e6de7)
  * frame #0: 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189
    frame #1: 0x00007f2b01ed4aaf mzR.so`pwiz::msdata::IO::HandlerMSData::startElement(this=0x00007ffcdab7b9f0, name=<unavailable>, attributes=<unavailable>, position=<unavailable>) + 511 at IO.cpp:2666
    frame #2: 0x00007f2b01fd1593 mzR.so`pwiz::minimxml::SAXParser::(anonymous namespace)::HandlerWrangler::startElement(this=0x00007ffcdab7b5b0, name="mzML", attributes=0x00007ffcdab7b558, position=45) const + 147 at SAXParser.cpp:211
    frame #3: 0x00007f2b01fd2cfa mzR.so`pwiz::minimxml::SAXParser::parse(is=0x00000000066528c0, handler=0x00007ffcdab7b9f0) + 2810 at SAXParser.cpp:531
    frame #4: 0x00007f2b01ec1927 mzR.so`pwiz::msdata::IO::read(is=0x00000000066528c0, msd=0x0000000004acc510, spectrumListFlag=IgnoreSpectrumList) + 3671 at IO.cpp:2766
    frame #5: 0x00007f2b01e5747b mzR.so`pwiz::msdata::Serializer_mzML::Impl::read(this=0x000000000501ff30, is=shared_ptr<std::basic_istream<char, std::char_traits<char> > > @ 0x00007ffcdab7c8e0, msd=0x0000000004acc510) const + 107 at Serializer_mzML.cpp:223
    frame #6: 0x00007f2b01e57b2a mzR.so`pwiz::msdata::Serializer_mzML::read(this=<unavailable>, is=<unavailable>, msd=<unavailable>) const + 58 at Serializer_mzML.cpp:250
    frame #7: 0x00007f2b01e43964 mzR.so`pwiz::msdata::Reader_mzML::read(this=<unavailable>, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head=<unavailable>, result=0x0000000004acc510, runIndex=<unavailable>, config=<unavailable>) const + 948 at DefaultReaderList.cpp:148
    frame #8: 0x00007f2b01e5d855 mzR.so`pwiz::msdata::ReaderList::read(this=0x00000000058e1170, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=", result=0x0000000004acc510, sampleIndex=0, config=0x00007ffcdab7cb10) const + 181 at Reader.cpp:101
    frame #9: 0x00007f2b01eea60f mzR.so`pwiz::msdata::(anonymous namespace)::(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", msd=0x0000000004acc510, reader=0x00000000058e1170, head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=")(const string &const, pwiz::msdata::MSData &const, const pwiz::msdata::Reader &const, const string &const) + 127 at MSDataFile.cpp:61
    frame #10: 0x00007f2b01eec0ba mzR.so`pwiz::msdata::MSDataFile::MSDataFile(this=0x0000000004acc510, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", reader=<unavailable>, calculateSourceFileChecksum=<unavailable>) + 218 at MSDataFile.cpp:91
    frame #11: 0x00007f2b01ee434a mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [inlined] pwiz::msdata::RAMPAdapter::Impl::Impl(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", this=0x0000000004acc510) + 5 at RAMPAdapter.cpp:49
    frame #12: 0x00007f2b01ee4345 mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(this=0x00000000050b8110, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 37 at RAMPAdapter.cpp:296
    frame #13: 0x00007f2b01d3df7e mzR.so`rampOpenFile(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 542 at ramp.cpp:284
    frame #14: 0x00007f2b01d3cd15 mzR.so`cRamp::cRamp(this=0x00000000050e4ce0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 149 at cramp.cpp:75
    frame #15: 0x00007f2b01d45af8 mzR.so`RcppRamp::open(this=0x0000000004fd0ec0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 72 at RcppRamp.cpp:23
    frame #16: 0x00007f2b01d5a955 mzR.so`Rcpp::CppMethod2<RcppRamp, void, char const*, bool>::operator(this=0x00000000050c9990, object=0x0000000004fd0ec0, args=<unavailable>)(RcppRamp*, SEXPREC**) + 245 at Module_generated_CppMethod.h:215
    frame #17: 0x00007f2b01d570f0 mzR.so`Rcpp::class_<RcppRamp>::invoke_void(this=<unavailable>, method_xp=<unavailable>, object=0x0000000005cbe020, args=0x00007ffcdab7d2f0, nargs=<unavailable>) + 176 at class.h:212
    frame #18: 0x00007f2b027e3f41 Rcpp.so`CppMethod__invoke_void(args=<unavailable>) + 449 at Module.cpp:200
    frame #19: 0x00007f2b26a6e9b1 libR.so`do_External(call=0x000000000570f208, op=0x000000000333ac20, args=0x0000000005dd50b8, env=0x0000000005dd50f0) + 337 at dotcode.c:548
    frame #20: 0x00007f2b26aa86df libR.so`Rf_eval(e=0x000000000570f208, rho=0x0000000005dd50f0) + 1871 at eval.c:713
    frame #21: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x000000000570f198, op=0x00000000033254d8, args=0x000000000570f358, rho=0x0000000005dd50f0) + 344 at eval.c:1807
    frame #22: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x0000000005dd50f0) + 1345 at eval.c:685
    frame #23: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
    frame #24: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x00000000062d0ad8, rho=0x000000000595b3e0) + 797 at eval.c:732
    frame #25: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d0d08, op=0x00000000033254d8, args=0x00000000062d0b10, rho=0x000000000595b3e0) + 344 at eval.c:1807
    frame #26: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
    frame #27: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
    frame #28: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d04a0, op=0x00000000033254d8, args=0x00000000062d0e90, rho=0x000000000595b3e0) + 344 at eval.c:1807
    frame #29: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
    frame #30: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
    frame #31: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x000000000595a9f0, rho=0x000000000334ff58) + 797 at eval.c:732
    frame #32: 0x00007f2b26aabf66 libR.so`do_set(call=0x000000000595b680, op=0x000000000330c2d8, args=<unavailable>, rho=0x000000000334ff58) + 166 at eval.c:2197
    frame #33: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000334ff58) + 1345 at eval.c:685
    frame #34: 0x00007f2b26acf932 libR.so`Rf_ReplIteration(rho=0x000000000334ff58, savestack=<unavailable>, browselevel=<unavailable>, state=0x00007ffcdab7f020) + 546 at main.c:258
    frame #35: 0x00007f2b26acfca1 libR.so`R_ReplConsole(rho=0x000000000334ff58, savestack=0, browselevel=0) + 129 at main.c:308
    frame #36: 0x00007f2b26acfd58 libR.so`run_Rmainloop + 72 at main.c:1059
    frame #37: 0x0000000000d85dc2 rsession`rstudio::r::session::runEmbeddedR(rstudio::core::FilePath const&, rstudio::core::FilePath const&, bool, bool, SA_TYPE, rstudio::r::session::Callbacks const&, rstudio::r::session::InternalCallbacks*) + 434
    frame #38: 0x0000000000d6791d rsession`rstudio::r::session::run(rstudio::r::session::ROptions const&, rstudio::r::session::RCallbacks const&) + 9581
    frame #39: 0x00000000006a7299 rsession`main + 10809
    frame #40: 0x00007f2b2530e830 libc.so.6`__libc_start_main(main=(rsession`main), argc=11, argv=0x00007ffcdab81fd8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffcdab81fc8) + 240 at libc-start.c:291
    frame #41: 0x00000000006b4fe1 rsession`_start + 41

This definitely looks like a memory problem on the ProteoWizard side that somehow goes unnoticed in the plain R process.

1 answer:

Answer 0 (score: 3)

When I run this with a build of R compiled with sanitizers, I see:

> source("msdata.R", echo = TRUE)

> library(mzR)
Loading required package: Rcpp

> library(msdata)

> mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
+                      package = "msdata")

> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in

> mzml <- system.file("microtofq/MM8.mzML", package = "msdata")

> bb <- openMSfile(mzml) # this crashes R, but only in RStudio

In particular, this part:

> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in

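means that something is likely going wrong inside the openMSfile() function: it appears to be trying to read data at an invalid offset (index -1 into a 513-byte char buffer). Note that the warning is reported for the mzXML call that otherwise "works". I would report this to the mzR maintainers.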
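To make the sanitizer message concrete, here is a minimal C++ sketch of one way an index of -1 into a char[513] buffer can arise: a trailing-whitespace trim that looks at the last character of an empty line. This is an assumption for illustration only. I have not confirmed that this is what ramp.cpp:1197 actually does, and the names kBufSize, lastCharIndexUnchecked and trimTrailingWhitespace are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical fixed-size line buffer. The size 513 matches the 'char [513]'
// in the sanitizer report; nothing else here is taken from ramp.cpp.
const int kBufSize = 513;

// Index of the last character, computed the way an unguarded trimming loop
// would compute it. For an empty string this is -1, and using that value as
// buf[index] is the kind of access UBSan reports as "index -1 out of bounds".
int lastCharIndexUnchecked(const char (&buf)[kBufSize]) {
    return static_cast<int>(std::strlen(buf)) - 1;
}

// Guarded variant: strips trailing whitespace and never indexes below 0.
// Returns the length of the string after trimming.
int trimTrailingWhitespace(char (&buf)[kBufSize]) {
    int len = static_cast<int>(std::strlen(buf));
    assert(len < kBufSize);  // buffer must be NUL-terminated within bounds
    while (len > 0 &&
           std::isspace(static_cast<unsigned char>(buf[len - 1]))) {
        buf[--len] = '\0';
    }
    return len;
}
```

With an empty buffer, trimTrailingWhitespace returns 0 without touching buf[-1], whereas indexing with the result of lastCharIndexUnchecked would read one byte before the start of the array. Such a read often goes unnoticed in a normal build, which would be consistent with the call appearing to work.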

> sessionInfo()
R Under development (unstable) (2017-01-17 r72002)
Platform: x86_64-apple-darwin16.3.0 (64-bit)
Running under: macOS Sierra 10.12.2

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] msdata_0.15.0        mzR_2.9.3            Rcpp_0.12.9
[4] testthat_1.0.2       rmarkdown_1.3        knitr_1.15.1
[7] roxygen2_5.0.1       devtools_1.12.0.9000

loaded via a namespace (and not attached):
 [1] magrittr_1.5        BiocGenerics_0.21.3 pkgload_0.0.0.9000
 [4] R6_2.2.0            stringr_1.1.0       tools_3.4.0
 [7] pkgbuild_0.0.0.9000 parallel_3.4.0      Biobase_2.35.0
[10] withr_1.0.2         htmltools_0.3.5     ProtGenerics_1.7.0
[13] rprojroot_1.2       digest_0.6.11       crayon_1.3.2
[16] codetools_0.2-15    memoise_1.0.0       evaluate_0.10
[19] stringi_1.1.2       compiler_3.4.0      backports_1.0.5

EDIT: This is the configure call used to build R; all packages were built with the same compiler and flags. (Taken from R.home("etc/Makeconf").)

# R was configured using the following call
# (not including env. vars and site configuration)
# configure  '--with-blas=-L/usr/local/opt/openblas/lib -lopenblas' '--with-lapack=-L/usr/local/opt/lapack/lib -llapack' '--with-cairo' '--disable-R-framework' '--enable-R-shlib' '--with-readline' '--enable-R-profiling' '--enable-memory-profiling' '--with-valgrind-instrumentation=2' '--without-internal-tzcode' '--prefix=/Users/kevin/r/r-devel-sanitizers' 'PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig'

CC = clang-3.9 -std=gnu99 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
CXX = clang++-3.9 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero 

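EDIT V2: Given the lldb stack trace, it looks like the culprit may indeed be a conflict between Boost versions: the copy bundled with RStudio versus the one used by mzR. Note that mzR is now inadvertently calling a Boost routine inside the rsession executable (frame #0, boost::filesystem::path::filename()), when it presumably intended to call its own bundled version.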