Here is an example of my strings:
<s id="1">Here we show that <ANAPH id="535" biotype="partof_product">the approximately 600-amino acid; region</ANAPH> something somethingelse .</s>
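The goal is to clean the string by removing every sequence enclosed in angle brackets, including the brackets themselves. For the example string above, the desired output is: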
Here we show that the approximately 600-amino acid; region something somethingelse .
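With the regex \<{1}.*\>{1} and the replaceAll function, the whole line gets replaced. I understand why: .* is greedy, so the match runs from the first < to the last > on the line. Can someone point out a more specific way to express the pattern with a regex so that I get the desired output?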
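A minimal sketch of the behaviour described above, assuming Java's String.replaceAll. The class name GreedyDemo and the method stripGreedy are made up for illustration, and the pattern is written as <.*>, which matches the same text as \<{1}.*\>{1}.

```java
final class GreedyDemo {
    // Hypothetical helper: applies the greedy pattern from the question.
    // ".*" grabs as much as it can, so one match spans from the first '<'
    // to the last '>' in the input.
    static String stripGreedy(String input) {
        return input.replaceAll("<.*>", "");
    }

    public static void main(String[] args) {
        String line = "<s id=\"1\">Here we show that <ANAPH id=\"535\" "
                + "biotype=\"partof_product\">the approximately 600-amino acid; region"
                + "</ANAPH> something somethingelse .</s>";
        // The line starts with '<' and ends with '>', so nothing survives.
        System.out.println("[" + stripGreedy(line) + "]");
    }
}
```

Because the sample line begins with < and ends with >, the single greedy match covers all of it and the result is an empty string.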
Thanks.
EDIT1:
Yes, the regex suggested by Kassym Dorsel works for the string above.
However, for the following string:
<s id="7"><ANAPH id="100216" biotype="supertype" assoc_ante="48275" assoc_rel="set-member" coref_chain="set_234">The C. elegans genome sequence</ANAPH> was completed two years ago [ 1 ] , and both the Drosophila [ 2 ] and human genomes are essentially completely sequenced at this point .</s>
the output produced with the regex is:
<ANAPH id="100216" biotype="supertype" assoc_ante="48275" assoc_rel="set-member" coref_chain="set_234">The C. elegans genome sequence</ANAPH> was completed two years ago [ 1 ] , and both the Drosophila [ 2 ] and human genomes are essentially completely sequenced at this point .</s>
The desired output is:
The C. elegans genome sequence was completed two years ago [ 1 ] , and both the Drosophila [ 2 ] and human genomes are essentially completely sequenced at this point .
Can you help me generalize the regex?
Answer 0 (score: 4)
Given this: <s id="1">Here we show that <ANAPH id="535" biotype="partof_product">the approximately 600-amino acid; region</ANAPH> something somethingelse .</s>
Use this: <[^>]*?>
Replacing every match with nothing (an empty string) gives:
Here we show that the approximately 600-amino acid; region something somethingelse .
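As a rough sketch, the pattern above could be applied in Java as follows. The class name TagStripper and the method stripTags are made up for illustration. The sketch assumes that no attribute value contains a > character. It is plain text substitution, not XML parsing.

```java
import java.util.regex.Pattern;

final class TagStripper {
    // '<', then any characters except '>', then the closing '>'.
    // The lazy '?' is kept from the answer; since [^>] can never match '>',
    // the greedy form <[^>]*> would match exactly the same text.
    private static final Pattern TAG = Pattern.compile("<[^>]*?>");

    // Hypothetical helper: removes every bracketed tag from the input.
    static String stripTags(String input) {
        return TAG.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String first = "<s id=\"1\">Here we show that <ANAPH id=\"535\" "
                + "biotype=\"partof_product\">the approximately 600-amino acid; region"
                + "</ANAPH> something somethingelse .</s>";
        String second = "<s id=\"7\"><ANAPH id=\"100216\" biotype=\"supertype\" "
                + "assoc_ante=\"48275\" assoc_rel=\"set-member\" coref_chain=\"set_234\">"
                + "The C. elegans genome sequence</ANAPH> was completed two years ago "
                + "[ 1 ] , and both the Drosophila [ 2 ] and human genomes are "
                + "essentially completely sequenced at this point .</s>";
        System.out.println(stripTags(first));
        System.out.println(stripTags(second));
    }
}
```

Each tag is matched on its own because the character class stops at the first >, so adjacent tags such as <s id="7"><ANAPH ...> are both removed and the text between tags is left untouched. Both sample strings from the question come out as the desired outputs listed there.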