I have the following data (all on one line):
<span id="ctb_0" onclick="show_hide_box(this);"
class="hide_icon r txtfont ltr">open</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Rayyan Real Investment</font>,
<span class="ltr txtfont">+92-3212459990</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Bukhari Properties</font>,
<span class="ltr txtfont">+92-3218248858</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Exact Properties</font>,
<span class="ltr txtfont">+92-3312044421</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Exact Properties</font>,
<span class="ltr txtfont">+92-3312044421</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Adeel Corporation</font>,
<span class="ltr txtfont">+923008253132</span>
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Adeel Corporation</font>,
<span class="ltr txtfont">+92-3008253132</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Z.S Associates</font>,
<span class="ltr txtfont">+92-3452431417</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Keystone Properties</font>,
<span class="ltr txtfont">+92-3353509187/301..</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Adeel Corporation</font>,
<span class="ltr txtfont">+92-3008253132</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Adeel Corporation</font>,
<span class="ltr txtfont">+92-3008253132</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Safeway Real Estate Consultant</font>,
<span class="ltr txtfont">+92-3218282885/345..</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Abdul Sattar & Sons</font>,
<span class="ltr txtfont">+92-3332107802, +9..</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Bismillah Real Estate</font>,
<span class="ltr txtfont">+92-3213336525, 03..</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Aiman Estate & Properties</font>,
<span class="ltr txtfont">+92-3212537535</span>,
<div class="description clr ltr txtfont">…</div>,
<font class="txtfont ltr">Aiman Estate & Properties</font>,
<span class="ltr txtfont">+92-3212537535</span>,
Using a regular expression in Notepad++, I want output like this:
923008929845
923318874928
923008275080
923452113010
923002024486
923218286664
923218286664
923212804245
923002555091
923212804245
923008289996
923003579717
923003579717
923003772227
923007048836
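I tried the following in Notepad++, but it is neither clean nor fast. I end up deleting the HTML code by hand, which keeps me from finishing the data scraping quickly.
Find what: [a-z] | [A-Z] | [,.()_ =;"+<> /: - ]
Replace with: (a space)
I still see a lot of stray characters.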
Answer 0: (score: 1)
How about:
Find what: ^.*?\+(\d\d)-(\d{10}).*?$
Replace with: $1$2\n
Explanation:
^ : beginning of line
.*? : 0 or more of any character (non-greedy)
\+ : a literal +, escaped because + is a special character in regex
(\d\d) : 2 digits, captured in group 1
- : a dash
(\d{10}) : 10 digits, captured in group 2
.*? : 0 or more of any character (non-greedy)
$ : end of line
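If you want to check the pattern outside Notepad++, here is a minimal Python sketch of the same idea. It assumes each span sits on its own line, as in the listing in the question, and it collects the matches rather than replacing in place, so lines without a number are simply skipped. The variable names are made up for illustration.

```python
import re

# Sample lines copied from the question; the surrounding HTML is left untouched.
sample = """<font class="txtfont ltr">Rayyan Real Investment</font>,
<span class="ltr txtfont">+92-3212459990</span>,
<font class="txtfont ltr">Bukhari Properties</font>,
<span class="ltr txtfont">+92-3218248858</span>,"""

# Same pattern as above; re.MULTILINE makes ^ and $ match at every line boundary.
pattern = re.compile(r"^.*?\+(\d\d)-(\d{10}).*?$", re.MULTILINE)

# Join group 1 (country code) and group 2 (number) for each matching line.
numbers = [m.group(1) + m.group(2) for m in pattern.finditer(sample)]
print("\n".join(numbers))
```

Because the pattern is anchored with ^ and $, it yields at most one number per line. If the data really is on a single physical line, only the first number would be captured.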
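Answer 1: (score: 0)
Try this.
Find what: \s.*\s.*?(\d+)-(\d{10})|.+
Replace with: $1$2
Note!!
This is what I have learned about regular expressions so far. I am not good at
regex, but the regex above works fine, except that it leaves 2 spaces between the numbers....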
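One way to sidestep the leftover whitespace is to extract the numbers instead of deleting everything around them. The following is only a sketch in Python, not a Notepad++ recipe. The helper name `extract_numbers` is hypothetical, and the optional dash (`-?`) is an assumption added so that an entry written as +923008253132 is also picked up.

```python
import re

def extract_numbers(text):
    """Return every '+CC-NNNNNNNNNN' number in text as a bare digit string."""
    # -? is an assumption: it also accepts numbers written without the dash.
    return [cc + num for cc, num in re.findall(r"\+(\d{2})-?(\d{10})", text)]

# Works the same whether the data is on one line or spread over many.
one_line = (
    '<span class="ltr txtfont">+92-3312044421</span>, '
    '<font class="txtfont ltr">Adeel Corporation</font>, '
    '<span class="ltr txtfont">+923008253132</span> '
    '<span class="ltr txtfont">+92-3353509187/301..</span>,'
)
print("\n".join(extract_numbers(one_line)))
```

For truncated entries such as +92-3353509187/301.., only the first complete number is returned; the cut-off part cannot be recovered.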
Answer 2: (score: -1)