Retrieve part of a string

Time: 2011-06-17 13:21:20

Tags: c# .net winforms substring

I'm using Html Agility Pack to parse an HTML page. I've successfully retrieved the following text into a string:

 
WOCN11 CWTO 170951

Special weather statement

Updated by Environment Canada

At 5:51 AM EDT Friday 17 June 2011.



Special weather statement issued for..

Sarnia - Lambton

London - Middlesex

Oxford - Brant

Waterloo - Wellington.

---------------------------------------------------------------------

Dense fog patches with near zero visibility have been reported in

The above areas. Extra caution is urged for travellers in these 

areas.



Fog is expected to lift shortly after sunrise this morning.



END/OSPC

ACCN10 CWTO 170735

Forecast of thunderstorm potential for the province of Ontario

Issued by Environment Canada at 3:35 AM EDT Friday 17 June 2011.

The next statement will be issued at 4.30 PM today.

---------------------------------------------------------------------

Forecast of thunderstorm potential.



Today..Isolated non severe thunderstorms over eastern

And Northeastern Ontario.



Tonight..Isolated non severe thunderstorms over eastern and 

Northeastern Ontario this evening.



Saturday..Isolated non severe thunderstorms over extreme

Southwestern Ontario mainly late in the afternoon and evening.



---------------------------------------------------------------------

A thunderstorm is defined as severe if it produces one or more of the 

following:



 - wind gusts of 90 km/h or greater.

 - hail of 2 centimetres in diameter or greater.

 - rainfall amounts of 50 millimetres or greater in one hour or less.

 - a tornado.



Note: this forecast is issued twice daily from May 1 to September 30.



END/OSPC

I want to extract only the following part:


Forecast of thunderstorm potential.



Today..Isolated non severe thunderstorms over eastern

And Northeastern Ontario.



Tonight..Isolated non severe thunderstorms over eastern and 

Northeastern Ontario this evening.



Saturday..Isolated non severe thunderstorms over extreme

Southwestern Ontario mainly late in the afternoon and evening.



I'm using C# on .NET 3.5. Any help is appreciated.

Question updated

4 answers:

Answer 0 (score: 3)

One way you could do it (though not 100% ideal) is this:

string[] textSplit = theWholeTextString.Split(new string[] { "---------------------------------------------------------------------" }, StringSplitOptions.None);
string myText = textSplit[2];

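This assumes, of course, that the text you want is always the third section and that the sections are always separated by the '------' line.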
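Because the approach above depends on the section index, a guarded version may be safer when a report has fewer sections than expected. The following is only a minimal sketch: the class and method names (`ReportSections`, `GetSection`) are hypothetical, and the separator is passed in by the caller rather than hard-coded.

```csharp
using System;

public static class ReportSections
{
    // Hypothetical helper: returns the trimmed section at the given zero-based
    // index, or null when the report has fewer sections than expected.
    public static string GetSection(string report, string separator, int index)
    {
        if (report == null) return null;

        // Same Split call as in the answer; None keeps empty pieces in place.
        string[] parts = report.Split(new string[] { separator }, StringSplitOptions.None);

        if (index < 0 || index >= parts.Length) return null;

        // Trim removes the blank lines that surround each section.
        return parts[index].Trim();
    }
}
```

Calling `ReportSections.GetSection(theWholeTextString, separator, 2)` with the dash line as `separator` would then return the third section, or null instead of throwing an IndexOutOfRangeException.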

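Answer 1 (score: 0)

For us to be able to help, you need to tell us how the text you want to keep is defined. Is it everything from a '---' line followed by 'Forecast' up to the last '---' line, or something else? A regular expression would do the job, but I can't give the exact syntax without more information.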
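If the wanted text is defined as "from a known heading up to the next dash line", plain IndexOf and Substring calls are enough and no regular expression is needed. This is a sketch under that assumption; `MarkerExtractor` and `Between` are hypothetical names, and the choice of markers is a guess based on the sample report.

```csharp
using System;

public static class MarkerExtractor
{
    // Returns the text from the first occurrence of startMarker up to (but not
    // including) the next occurrence of endMarker. Returns null if startMarker
    // is not found; if endMarker is missing, the rest of the text is returned.
    public static string Between(string text, string startMarker, string endMarker)
    {
        int start = text.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return null;

        // Search for the end marker only after the start marker.
        int end = text.IndexOf(endMarker, start + startMarker.Length, StringComparison.Ordinal);
        if (end < 0) end = text.Length;

        return text.Substring(start, end - start).Trim();
    }
}
```

In the sample, "Forecast of thunderstorm potential." with the trailing period first appears as the section heading (the earlier title line continues with "for the province of Ontario"), so a call such as `MarkerExtractor.Between(report, "Forecast of thunderstorm potential.", "-----")` would start at the right place.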

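Answer 2 (score: 0)

If only the content between the ------------- lines is what you need, try this regular expression: -{40,}([\s\S]*?(?=-{40,}))-{40,}

Regex.Match(report, @"-{40,}([\s\S]*?(?=-{40,}))-{40,}").Groups[1].Value

Note that Regex.Match returns only the first block enclosed by two dash lines, and Groups[1] holds the text between them without the dashes.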
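A pattern that consumes both dash lines cannot match two neighbouring sections, because they share a delimiter. One possible variation, sketched here, uses lookbehind and lookahead so the dash lines are not consumed, then picks the section by its heading instead of its position. The names `RegexSections` and `FindSection` are hypothetical, and the sketch assumes every dash line is at least 40 characters long and ends with a line break.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSections
{
    // Matches the text between two dash lines without consuming the lines
    // themselves, so sections that share a delimiter are all found.
    private static readonly Regex Section =
        new Regex(@"(?<=-{40,}\r?\n)[\s\S]*?(?=-{40,})");

    // Returns the first delimited section whose trimmed text starts with
    // the given heading, or null if no section matches.
    public static string FindSection(string report, string heading)
    {
        foreach (Match m in Section.Matches(report))
        {
            string body = m.Value.Trim();
            if (body.StartsWith(heading, StringComparison.Ordinal)) return body;
        }
        return null;
    }
}
```

With the sample report, `RegexSections.FindSection(report, "Forecast of thunderstorm potential.")` would skip the fog statement and return the forecast block.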

Answer 3 (score: 0)

It looks like the only thing separating the sections is the --------------------------------------------------------------------- line.

How about using string.Split()? Here is an example:

string[] textArray = wholeText.Split(new string[] {"---------------------------------------------------------------------"}, StringSplitOptions.RemoveEmptyEntries);

string text = textArray[2];
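The two Split-based answers differ only in the StringSplitOptions value, which can change which index holds the wanted section. The sketch below makes that difference easy to inspect; `SplitOptionsDemo` and `SplitSections` are hypothetical names used for illustration only.

```csharp
using System;

public static class SplitOptionsDemo
{
    // Splits text on separator. When removeEmpty is true, zero-length pieces
    // are dropped, which shifts the index of every later piece.
    public static string[] SplitSections(string text, string separator, bool removeEmpty)
    {
        StringSplitOptions options = removeEmpty
            ? StringSplitOptions.RemoveEmptyEntries
            : StringSplitOptions.None;
        return text.Split(new string[] { separator }, options);
    }
}
```

For a text that begins with the separator, such as "-----A-----B", StringSplitOptions.None yields three pieces (an empty string, "A", "B"), while RemoveEmptyEntries yields two. RemoveEmptyEntries drops only zero-length strings, so a piece that contains just a line break is kept. For the sample report both options give the same index, since the text before the first dash line is not empty.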