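I managed to find a nice pattern that matches specific data, but I am looking for an anti-pattern to use with the Regex.Replace method so I can clean out the useless data.
Original string: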
<h3>Non-Human Toxicity Values:</h3>
<br />LD50 Rat oral 100 mg/kg /SRP: percent solution not specified/<br /><br />LD50 Rat (albino) oral 2020 mg/kg /From table/ /SRP: percent solution not specified/<br /><br />LD50 Rat oral 800 mg/kg /from table/<br /><br />LD50 Rat sc 420 mg/kg<br /><br />LC50 Rat inhalation 0.82 mg/L (1/2 hour)<br /><br />LC50 Rat inhalation 0.48 mg/L/4 hr<br /><br />LD50 Rat iv 87 mg/kg /Source contains no data on purity of the compound/<br /><br />LD50 Mouse oral 42 mg/kg /Source contains no data on purity of the compound/<br /><br />LD50 Mouse sc 300 mg/kg /Source contains no data on purity of the compound/<br /><br />LC50 Mouse inhalation 400 mg/cu m/2 hr /Source contains no data on purity of the compound/<br /><br />LC50 Mouse inhalation 0.414 mg/L/4 hr<br /><br />LD50 Mouse ip 16 mg/kg /From table/<br /><br />LD50 Guinea pig oral 260 mg/kg /Source contains no data on purity of the compound/<br /><br />LD50 Rabbit percutaneous 270 mg/kg /<font color="red"><strong>Formalin</strong></font>/<br /><br />LD50 Rabbit sc 240 mg/kg /From table/<br /><br />LD50 Dog sc 550 mg/kg /From table/<br /><br />
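All I need are the values for Rat and Rabbit.
I am using ((LD|LC)50 (Rat)|(Rabbit)).*?(/kg|/L/l)
to match those values, but I would like a way to replace everything that does not match a given pattern.
I looked through other threads, but the solutions there exclude specific character types (digits, non-digits, non-word characters, and so on). Here I am looking to exclude by pattern.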
Answer 0 (score: 1)
When you run a regex match, you can take the matched groups and combine them however you like. For example:
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = @"<h3>Non-Human Toxicity Values:</h3>
<br />LD50 Rat oral 100 mg/kg /SRP: percent solution not specified/
<br /><br />LD50 Rat (albino) oral 2020 mg/kg /From table/ /SRP: percent solution not specified/
<br /><br />LD50 Rat oral 800 mg/kg /from table/
<br /><br />LD50 Rat sc 420 mg/kg
<br /><br />LC50 Rat inhalation 0.82 mg/L (1/2 hour)
<br /><br />LC50 Rat inhalation 0.48 mg/L/4 hr
<br /><br />LD50 Rat iv 87 mg/kg /Source contains no data on purity of the compound/
<br /><br />LD50 Mouse oral 42 mg/kg /Source contains no data on purity of the compound/
<br /><br />LD50 Mouse sc 300 mg/kg /Source contains no data on purity of the compound/
<br /><br />LC50 Mouse inhalation 400 mg/cu m/2 hr /Source contains no data on purity of the compound/
<br /><br />LC50 Mouse inhalation 0.414 mg/L/4 hr
<br /><br />LD50 Mouse ip 16 mg/kg /From table/
<br /><br />LD50 Guinea pig oral 260 mg/kg /Source contains no data on purity of the compound/
<br /><br />LD50 Rabbit percutaneous 270 mg/kg /<font color=""red""><strong>Formalin</strong></font>/
<br /><br />LD50 Rabbit sc 240 mg/kg /From table/
<br /><br />LD50 Dog sc 550 mg/kg /From table/
<br /><br />";
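// Match each LD50/LC50 entry for Rat or Rabbit; the named group "Pattern" holds the matched text.
// The match stops at the first character that is not a word character, whitespace, or '/'.
Regex ratRabbitRegex = new Regex(@"(?<Pattern>(LD|LC)\d\d (Rat|Rabbit) (\w|\s|/)+)");
var matches = ratRabbitRegex.Matches(input);
var result = new List<string>();
foreach (Match match in matches)
{
    // Collect the text captured by the named group for each match.
    result.Add(match.Groups["Pattern"].Value);
}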
Now you can format the strings in the result collection however you need.
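As a rough sketch of that last step, the matches can be joined back into a single cleaned string, which gives the effect of "removing everything that is not the pattern" without negating the pattern inside Regex.Replace. The class name ToxicityFilter, the method name KeepRatAndRabbit, and the pattern itself are assumptions made for illustration and are not part of the answer above. The pattern assumes every entry ends at the next <br /> tag or at the end of the input.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ToxicityFilter
{
    // Hypothetical pattern: an LD50/LC50 entry for Rat or Rabbit, running up to
    // (but not including) the next <br /> tag or the end of the input.
    private static readonly Regex EntryRegex = new Regex(
        @"\b(?:LD|LC)50 (?:Rat|Rabbit)\b.*?(?=<br\s*/>|$)",
        RegexOptions.Singleline);

    // Returns only the matching entries, one per line, with inline HTML tags stripped.
    public static string KeepRatAndRabbit(string html)
    {
        var entries = EntryRegex.Matches(html)
            .Cast<Match>()
            .Select(m => Regex.Replace(m.Value, "<[^>]+>", "").Trim());
        return string.Join(Environment.NewLine, entries);
    }
}
```

Calling ToxicityFilter.KeepRatAndRabbit(input) on the string from the question should keep entries such as "LD50 Rat (albino) oral 2020 mg/kg ..." and drop the Mouse, Guinea pig, and Dog lines. Because the entry runs to the next <br /> tag rather than to a fixed character class, punctuation such as ':' or '.' inside an entry does not cut the match short.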