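Regex - find all lines after a match: this may be a duplicate, but my requirements are slightly different.
I want to parse a plain text file that contains several series of date/value data separated by specific strings. I want to skip the first part of the file, up to the specific line after which I want to match results.
Here is a sample of the file in question (including the messy mix of tabs and spaces):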
I dont want to capture the following measures. This text is on a single line and contains tabs and spaces is also ends with this token : Token1
05/01/1969 0.01846
15/01/1969 0.16730
25/01/1969 0.33988
05/04/1969 0.81319
15/04/1969 0.76973
25/11/2011 0.24210
05/12/2011 0.25220
15/12/2011 0.31160
25/12/2011 0.36845
End : bla bla bla
This text is also on a single line and marks the beginning of a new series of results. These are the results that I want. it also ends with the following token : Token2
05/01/1969 109.46333
15/01/1969 110.06998 118.18000
25/01/1969 110.82954
05/02/1969 111.51394 118.83000
25/02/1969 112.36483
05/10/2011 114.38798 114.31000
05/10/2011 114.31000 114.38798 114.38798 114.38798 114.38798 114.38798 114.38798
25/12/2011 112.64000 112.41261 112.86301 113.25494 114.06421 115.93219 116.38780
05/01/2012 112.22834 112.92301 113.40561 114.78823 116.62931 117.43421
05/09/2012 110.01410 112.16391 112.88199 115.23640 117.04756 118.04632
15/09/2012 109.97572 112.00809 112.70266 114.91247 116.65256 117.57412
25/09/2012 109.93967 111.87272 112.53305 114.60381 116.26935 117.12756
End : Marks the end of the file
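What I want is to match every line that comes after the line ending with Token2. I tried various solutions from other similar questions, without success. I end up matching all the results in the file, and I am considering splitting the file before applying the pattern below. Is there a pure regex solution?
Here is the pattern that works on the whole file, using named capture groups: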
(?P<date>\d\d\/\d\d\/\d\d\d\d)\s*(?P<simul>\d+\.*\d*)[\t ]*(?P<observ>\d+\.*\d*){0,1}[\t ]*(?P<prev_no_rain>\d+\.*\d*){0,1}[\t ]*(?P<prev_10_dry>\d+\.*\d*){0,1}[\t ]*(?P<prev_20_dry>\d+\.*\d*){0,1}[\t ]*(?P<prev_50>\d+\.*\d*){0,1}[\t ]*(?P<prev_20_wet>\d+\.*\d*){0,1}[\t ]*(?P<prev_10_wet>\d+\.*\d*){0,1}
Regex101 link: https://regex101.com/r/a0mCZ2/3
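The splitting idea mentioned in the question could look roughly like this in Python. This is only a minimal sketch, not the asker's code: SAMPLE, the simplified ROW pattern and the rows_after helper are hypothetical, and it assumes the wanted series stops at the next line that starts with End.

```python
import re

# Hypothetical, shortened sample in the same layout as the file above.
SAMPLE = (
    "Header of the first series, ends with this token : Token1\n"
    "05/01/1969\t0.01846\n"
    "End : bla bla bla\n"
    "Header of the wanted series, ends with this token : Token2\n"
    "05/01/1969   109.46333\n"
    "15/01/1969\t110.06998 118.18000\n"
    "End : Marks the end of the file\n"
)

# Simplified row pattern (an assumption): a date, then one or more numbers
# separated by tabs or spaces.
ROW = re.compile(
    r"(?P<date>\d\d/\d\d/\d{4})[\t ]+"
    r"(?P<values>\d+(?:\.\d+)?(?:[\t ]+\d+(?:\.\d+)?)*)"
)

def rows_after(text, token):
    """Return (date, [values]) for rows between the token line and the next End line."""
    # Locate the first line that ends with the token (trailing blanks allowed).
    header = re.search(re.escape(token) + r"[\t ]*\r?$", text, flags=re.MULTILINE)
    if header is None:
        return []
    tail = text[header.end():]
    # Cut the tail at the next line starting with "End", if there is one.
    end = re.search(r"^End\b", tail, flags=re.MULTILINE)
    if end is not None:
        tail = tail[:end.start()]
    return [
        (m.group("date"), [float(v) for v in m.group("values").split()])
        for m in ROW.finditer(tail)
    ]
```

Calling rows_after(SAMPLE, "Token2") returns only the two rows of the second series. The cost is two passes over the text instead of one pattern, which is exactly what a pure regex solution would avoid.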
Answer 0 (score: 2)
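You can use the \G operator, which matches at the start of the string (this can be excluded with a negative lookahead) and at the end of the previous successful match. With (?:\G(?!\A)|\bToken2[\r\n]+), we tell the regex engine to find the whole word Token2 at the end of a line (together with its line break characters), and then to match the subpattern that follows only when each match comes immediately after the previous one.
The regex you can use: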
(?:\G(?!\A)[\r\n]*|Token2[\r\n]+)\K(?P<date>\d\d\/\d\d\/\d{4})\s*(?P<simul>\d+\.*\d*)[\t ]*(?P<observ>\d+\.*\d*)?[\t ]*(?P<prev_no_rain>\d+(?:\.\d+)*)?[\t ]*(?P<prev_10_dry>\d+\.*\d*)?[\t ]*(?P<prev_20_dry>\d+\.*\d*)?[\t ]*(?P<prev_50>\d+\.*\d*)?[\t ]*(?P<prev_20_wet>\d+\.*\d*)?[\t ]*(?P<prev_10_wet>\d+\.*\d*)?
See the regex demo. Note that I replaced {0,1} with ? to shorten the pattern.
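As a quick sanity check of that shortening, the two quantifiers can be compared on one optional group. This is a small sketch in Python, whose re module accepts the same (?P<name>...) syntax; the names LONG, SHORT and same_result are made up for the illustration.

```python
import re

# The same optional group, written with {0,1} and with ?.
LONG = re.compile(r"(?P<observ>\d+\.*\d*){0,1}")
SHORT = re.compile(r"(?P<observ>\d+\.*\d*)?")

def same_result(text):
    """True when both spellings capture the same value (or both capture nothing)."""
    return LONG.match(text).group("observ") == SHORT.match(text).group("observ")
```

Both patterns always match because the group is optional, so same_result only compares what the group captured.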
The part you are interested in is (?:\G(?!\A)[\r\n]*|Token2[\r\n]+)\K.
(?:\G(?!\A)[\r\n]*|Token2[\r\n]+) - one of two alternatives:
    \G(?!\A)[\r\n]* - the end of the previous successful match, followed by 0+ line break characters
    | - or
    Token2[\r\n]+ - Token2 followed by 1+ CR or LF characters (if you need to match Token2 as a whole word, add \b before it)
\K - discards the text matched so far from the returned match
(?P<date>\d\d\/\d\d\/\d{4})\s*(?P<simul>\d+\.*\d*)[\t ]*(?P<observ>\d+\.*\d*)?[\t ]*(?P<prev_no_rain>\d+(?:\.\d+)*)?[\t ]*(?P<prev_10_dry>\d+\.*\d*)?[\t ]*(?P<prev_20_dry>\d+\.*\d*)?[\t ]*(?P<prev_50>\d+\.*\d*)?[\t ]*(?P<prev_20_wet>\d+\.*\d*)?[\t ]*(?P<prev_10_wet>\d+\.*\d*)? - your pattern, which I did not modify much, and which matches the lines holding the actual data (note that the fact that it matches whole lines is what justifies the use of [\r\n]* after \G(?!\A)).
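The pattern above relies on \G and \K, which PCRE supports but Python's standard re module does not. For readers working in Python, the same chaining idea can be approximated by anchoring each match at the end of the previous one with Pattern.match(text, pos). The following is only a sketch under that assumption: the simplified ROW pattern and the contiguous_rows helper are hypothetical and not part of the answer.

```python
import re

# Start of the wanted block: Token2 followed by 1+ CR or LF characters.
TOKEN_LINE = re.compile(r"Token2[\r\n]+")
# Simplified row pattern (an assumption): a date, then one or more numbers
# separated by tabs or spaces, then optional trailing blanks.
ROW = re.compile(
    r"(?P<date>\d\d/\d\d/\d{4})(?P<values>(?:[\t ]+\d+(?:\.\d+)?)+)[\t ]*"
)
LINE_BREAKS = re.compile(r"[\r\n]*")

def contiguous_rows(text):
    """Collect rows that directly follow the Token2 line, one after another."""
    start = TOKEN_LINE.search(text)
    if start is None:
        return []
    pos = start.end()
    rows = []
    while True:
        # match() with a pos argument only succeeds exactly at pos,
        # which plays the role of \G in the PCRE pattern.
        m = ROW.match(text, pos)
        if m is None:
            break  # the chain is broken, e.g. by the "End : ..." line
        rows.append((m.group("date"), m.group("values").split()))
        # Skip the line break characters, like [\r\n]* after \G(?!\A).
        pos = LINE_BREAKS.match(text, m.end()).end()
    return rows
```

Because each row must start where the previous one ended, data rows that appear after a non-matching line (such as End : ...) are not collected, which mirrors how the \G alternative stops matching in the original regex. The third-party regex package for Python does accept \G and \K, so the answer's pattern may be usable there more directly.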