Remove characters from a string (in R)

Date: 2017-10-24 08:58:25

Tags: r regex string

I'm new to this community and have a question (I couldn't find an existing one that helps).

I have this string:

{name:GTP hydrolysis and joining of the 60S ribosomal subunit,description:Hydrolysis of eIF2-GTP occurs after the Met-tRNAi has recognized the AUG. This reaction is catalyzed by eIF5 (or eIF5B) and is thought to cause dissociation of all other initiation factors and allow joining of the large 60S ribosomal subunit. The 60S subunit joins - a reaction catalyzed by eIF5 or eIF5B - resulting in a translation-competent 80S ribosome. Following 60S subunit joining, eIF5B hydrolyzes its GTP and is released from the 80S ribosome, which is now ready to start elongating the polypeptide chain.,url:https://reactome.org/PathwayBrowser/#/R-HSA-72706,sameAs:null,version:62,keywords:[Pathway],creator:[],includedInDataCatalog:{url:https://reactome.org,name:Reactome,@type:DataCatalog},distribution:[{contentUrl:https://reactome.org/ContentService/exporter/sbml/72706.xml,fileFormat:SBML,@type:DataDownload},{contentUrl:https://reactome.org/ReactomeRESTfulAPI/RESTfulWS/sbgnExporter/72706,fileFor... <truncated>

It's very messy, and I want to delete every character before the word description, so that it ends up like this:

description:Hydrolysis of eIF2-GTP occurs after the Met-tRNAi has recognized the AUG. This reaction is catalyzed by eIF5 (or eIF5B) and is thought to cause dissociation of all other initiation factors and allow joining of the large 60S ribosomal subunit. The 60S subunit joins - a reaction catalyzed by eIF5 or eIF5B - resulting in a translation-competent 80S ribosome. Following 60S subunit joining, eIF5B hydrolyzes its GTP and is released from the 80S ribosome, which is now ready to start elongating the polypeptide chain.,url:https://reactome.org/PathwayBrowser/#/R-HSA-72706,sameAs:null,version:62,keywords:[Pathway],creator:[],includedInDataCatalog:{url:https://reactome.org,name:Reactome,@type:DataCatalog},distribution:[{contentUrl:https://reactome.org/ContentService/exporter/sbml/72706.xml,fileFormat:SBML,@type:DataDownload},{contentUrl:https://reactome.org/ReactomeRESTfulAPI/RESTfulWS/sbgnExporter/72706,fileFor... <truncated>

Thanks in advance!

3 answers:

Answer 0 (score: 3)

You should use a regex approach, so that you can handle a varying number of leading characters:

a <- "{name:GTP hydrolysis and joining of the 60S ribosomal subunit,description:Hydrolysis of eIF2-GTP occurs after the Met-tRNAi has recognized the AUG. This reaction is catalyzed by eIF5 (or eIF5B) and is thought to cause dissociation of all other initiation factors and allow joining of the large 60S ribosomal subunit. The 60S subunit joins - a reaction catalyzed by eIF5 or eIF5B - resulting in a translation-competent 80S ribosome. Following 60S subunit joining, eIF5B hydrolyzes its GTP and is released from the 80S ribosome, which is now ready to start elongating the polypeptide chain.,url:https://reactome.org/PathwayBrowser/#/R-HSA-72706,sameAs:null,version:62,keywords:[Pathway],creator:[],includedInDataCatalog:{url:https://reactome.org,name:Reactome,@type:DataCatalog},distribution:[{contentUrl:https://reactome.org/ContentService/exporter/sbml/72706.xml,fileFormat:SBML,@type:DataDownload},{contentUrl:https://reactome.org/ReactomeRESTfulAPI/RESTfulWS/sbgnExporter/72706,fileFor..."

# Lazy match up to the first "description:"; the lookahead keeps the word itself
sub('^.*?(?=description:)', '', a, perl = TRUE)
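
Two details matter with this kind of pattern: whether the quantifier is greedy, and whether the word description: itself survives the replacement. Here is a minimal sketch on a shortened stand-in string (the value of `s` is made up for illustration, and it repeats the keyword on purpose; the real string may not):

```r
# Shortened stand-in for the real string; `s` is a made-up example value.
s <- "{name:GTP hydrolysis,description:Hydrolysis of eIF2-GTP,url:x,description:again"

# Greedy .* runs to the LAST "description:" and the keyword itself is removed.
greedy <- sub(".*description:", "", s)

# Lazy .*? stops at the FIRST occurrence; the capture group puts the keyword back.
lazy <- sub(".*?(description:)", "\\1", s, perl = TRUE)

# A lookahead checks for the keyword without consuming it, so nothing needs restoring.
look <- sub("^.*?(?=description:)", "", s, perl = TRUE)

print(greedy)
print(lazy)
print(look)
```

If the keyword appears only once, the greedy and lazy forms cut at the same place, and the only difference left is whether description: is kept.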

Answer 1 (score: 3)

You can use str_extract from stringr:

library(stringr)
str_extract(text, "description:(?s)(.*$)")

"description:Hydrolysis of eIF2-GTP occurs after the ...
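
If you would rather not load stringr, roughly the same extraction can be sketched in base R with regexpr and regmatches. This assumes a short made-up string `s` with an embedded newline, to show what the (?s) flag is for (it lets `.` match newlines as well):

```r
# Made-up example value with a line break inside the description.
s <- "{name:GTP hydrolysis,description:Hydrolysis of eIF2-GTP.\nSecond line,url:x"

# (?s) makes . match newlines too, so the match runs to the end of the string.
m <- regexpr("(?s)description:.*", s, perl = TRUE)

# regmatches() pulls out the matched substring for each element that matched.
extracted <- regmatches(s, m)
print(extracted)
```

One difference to keep in mind: str_extract returns NA for elements without a match, whereas regmatches with a regexpr result simply drops them, so the output can be shorter than the input.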

Answer 2 (score: 0)

How about this?

library(stringr)
yourData$yourColumn <- str_sub(yourData$yourColumn, start=63)  # "description" starts at character 63 in the string above
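
A fixed offset only works when every value has a prefix of exactly the same length. As a hedged alternative, the start position can be looked up per element instead of counted by hand. The sketch below uses base R and a made-up vector `x` standing in for the column:

```r
# Two made-up strings whose prefixes have different lengths.
x <- c("{name:short,description:first text",
       "{name:a much longer name,description:second text")

# Position of the first "description:" in each element (-1 if it is absent).
# as.integer() drops the extra attributes that regexpr() attaches.
pos <- as.integer(regexpr("description:", x, fixed = TRUE))

# Keep everything from that position on; leave unmatched elements unchanged.
trimmed <- ifelse(pos > 0, substring(x, pos), x)
print(trimmed)
```

Because regexpr and substring are both vectorised, the same three lines should work directly on a data frame column.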