I have the following Markdown file:
#2016-12-24
| 单词 | 解释 | 例句 |
| --------- | -------- | --------- |
|**accelerator;**| - | - |
|**compass**| - | - |
|**wheels**| - | - |
|**fabulous**| - | - |
|**sweeping**| - | - |
|**prospect**| - | - |
|**pumpkin**| - | - |
|**trolley**| - | - |
|snapped,**| - | - |
|tip| - | - |
|lap| - | - |
|tether.| - | - |
|damp| - | - |
|triumphant| - | - |
|sarcastic| - | - |
|missed out| - | - |
|sidekick| - | - |
|considerable| - | - |
|Willow.| - | - |
|eagle.| - | - |
|considerably.| - | - |
|flat.| - | - |
|feast| - | - |
|scramble| - | - |
|turned up| - | - |
|rounded off| - | - |
|rat| - | - |
|resembled| - | - |
|By the time she had clambered back into the car,| - | - |
|By the time she had clambered back into the car, they were running very late,| - | - |
|wheeled his trolley| - | - |
|barrier,| - | - |
|bounced| - | - |
|in blazes| - | - |
|clutching| - | - |
|sealed| - | - |
|stunned.| - | - |
|‘We’re stuck,| - | - |
|marched off| - | - |
|accelerator| - | - |
|and the prospect of seeing Fred and George’s jealous faces| - | - |
|protest.| - | - |
|in protest.| - | - |
|horizon,| - | - |
|knuckles| - | - |
|metal| - | - |
|thick| - | - |
|reached the end of its tether.| - | - |
|Artefacts| - | - |
|blurted out.| - | - |
|gaped| - | - |
|I will be writing to both your families tonight.| - | - |
|‘Can you believe our luck, though?’| - | - |
|‘Skip the lecture,’| - | - |
|people’ll be talking about that one for years!’| - | - |
|nudged| - | - |
|‘I know I shouldn’t’ve enjoyed that or anything, but –’| - | - |
|dashed| - | - |
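I want to extract the sentence-like entries from the first column. I tried to do this on regex101.com, but my pattern matches every row each time.
Can anyone help me?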
Answer 0: (score: 1)
Try this:
^\|[^\w\|]*(\w+\s+(?=\w+)[^\|]*)
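^\|
matches the pipe at the start of the line.
[^\w\|]*
skips any characters that are neither word characters (letters, digits, underscore) nor a pipe.
\w+\s+
requires a word followed by one or more whitespace characters.
(?=\w+)
then checks that another word follows.
[^\|]*
if the previous conditions hold, grabs everything up to the next pipe |.
For each match, group 1 contains the sentence you want.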
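As a rough way to check this outside regex101, here is a minimal Python sketch that applies the pattern above to a few rows copied from the question. It assumes the multiline flag is on so that ^ matches at every line start; the names PATTERN, SAMPLE and extract_sentences are made up for illustration.

```python
import re

# The pattern from the answer, unchanged. re.MULTILINE (an assumption about
# the intended flags) lets ^ match at the start of every line.
PATTERN = re.compile(r"^\|[^\w\|]*(\w+\s+(?=\w+)[^\|]*)", re.MULTILINE)

# A few rows copied from the table in the question.
SAMPLE = """\
| 单词 | 解释 | 例句 |
| --------- | -------- | --------- |
|**accelerator;**| - | - |
|tip| - | - |
|tether.| - | - |
|missed out| - | - |
|wheeled his trolley| - | - |
|I will be writing to both your families tonight.| - | - |
|people’ll be talking about that one for years!’| - | - |
"""


def extract_sentences(text: str) -> list[str]:
    """Return group 1 of every match, i.e. the multi-word first cells."""
    return PATTERN.findall(text)


for sentence in extract_sentences(SAMPLE):
    print(sentence)
```

One thing this sketch shows: the pattern requires whitespace right after the first word, so a row such as "people’ll be talking about that one for years!’" is skipped, because an apostrophe follows "people".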
Answer 1: (score: 0)
You could come up with:
^\|                   # start of line, followed by |
(                     # capture the "words"
  (?:[‘\w]+           # a non-capturing group and at least one of \w or ‘
    (?:[^|\w\n\r]+    # followed by NOT one of these
      |               # or
      (?=\|)          # make sure there's a | straight ahead
    )
  ){2,}               # repeat the construct at least 2 times
)
\|
See a demo on regex101.com (and mind the modifiers!).
This captures at least two consecutive words; if you need more, put another number inside the {} braces.
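The same idea can be sketched in Python. This assumes the modifiers mentioned above are multiline and verbose (free-spacing) mode. The helper names build_pattern and extract_entries and the min_words parameter are hypothetical, added only to show how the number in the braces changes the result.

```python
import re


def build_pattern(min_words: int = 2) -> re.Pattern:
    """Compile the verbose pattern with a configurable minimum word count."""
    return re.compile(
        r"""
        ^\|                   # start of line, followed by |
        (                     # group 1: the entry itself
          (?:[‘\w]+           # one or more word characters or an opening quote
            (?:[^|\w\n\r]+    # then separators: no pipe, word char or newline
              |               # or
              (?=\|)          # the closing pipe directly ahead
            )
          ){%d,}              # at least min_words such words
        )
        \|
        """ % min_words,
        re.VERBOSE | re.MULTILINE,  # assumed modifiers: x and m
    )


# A few rows copied from the table in the question.
SAMPLE = """\
|tip| - | - |
|tether.| - | - |
|missed out| - | - |
|‘We’re stuck,| - | - |
|people’ll be talking about that one for years!’| - | - |
"""


def extract_entries(text: str, min_words: int = 2) -> list[str]:
    """Return the first-column entries with at least min_words words."""
    return build_pattern(min_words).findall(text)


print(extract_entries(SAMPLE))
print(extract_entries(SAMPLE, min_words=3))
```

Note that an apostrophe counts as a separator here, so "We’re" is treated as two words. That is why "‘We’re stuck," still matches when the minimum is raised to 3, while "missed out" does not.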