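I'm trying to delete the commas between double quotes in a string while leaving all other commas intact. (The string contains an email address whose display name sometimes has stray commas in it.) The following "brute force" code works fine on my particular machine, but is there a more elegant way to do it, perhaps with just a single regex?
Duncan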
$string = '06/14/2015,19:13:51,"Mrs, Nkoli,,,ka N,ebedo,,m" <ubabankoffice93@gmail.com>,1,2';
print "Initial string = ", $string, "<br>\n";
# Extract stuff between the quotes
$string =~ /\"(.*?)\"/;
$name = $1;
print "name = ", $1, "<br>\n";
# Delete all commas between the quotes
$name =~ s/,//g;
print "name minus commas = ", $name, "<br>\n";
# Put the modified name back between the quotes
$string =~ s/\"(.*?)\"/\"$name\"/;
print "new string = ", $string, "<br>\n";
Answer 0 (score: 2)
You can use this pattern:
$string =~ s/(?:\G(?!\A)|[^"]*")[^",]*\K(?:,|"(*SKIP)(*FAIL))//g;
Pattern details:
(?: # two possible beginnings:
\G(?!\A) # contiguous to the previous match
| # OR
[^"]*" # all characters until an opening quote
)
[^",]* #"# all that is not a quote or a comma
\K # discard all previous characters from the match result
(?: # two possible cases:
, # a comma is found, so it will be replaced
| # OR
"(*SKIP)(*FAIL) #"# when the closing quote is reached, make the pattern fail
# and force the regex engine to not retry previous positions.
)
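To see the pattern in action, here is a minimal sketch that applies it to the sample string from the question. The helper name remove_quoted_commas is hypothetical and only wraps the substitution; the sketch assumes perl 5.10 or later, since \K and the backtracking control verbs are not available before that.

```perl
use strict;
use warnings;

# Hypothetical helper, used only for this demonstration.
sub remove_quoted_commas {
    my ($string) = @_;
    # Same pattern as above: each match ends on one comma inside the quotes,
    # and \K keeps everything before that comma out of the replaced text.
    $string =~ s/(?:\G(?!\A)|[^"]*")[^",]*\K(?:,|"(*SKIP)(*FAIL))//g;
    return $string;
}

my $string = '06/14/2015,19:13:51,"Mrs, Nkoli,,,ka N,ebedo,,m" <ubabankoffice93@gmail.com>,1,2';
print remove_quoted_commas($string), "\n";
```

Only the commas inside the quoted name should disappear; the field separators outside the quotes are left alone.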
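If you use an older perl version, \K and the backtracking control verbs may not be supported. In that case, you can use this pattern with capture groups instead: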
$string =~ s/((?:\G(?!\A)|[^"]*")[^",]*)(?:,|("[^"]*(?:"|\z)))/$1$2/g;
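A quick way to check the capture-group variant is the sketch below. One thing to be aware of: whenever the comma branch matches, the second group does not participate, so $2 is undefined and `use warnings` reports an uninitialized value in the replacement. The sketch silences that warning category locally, which is an assumption about what you want; the helper name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, used only for this demonstration.
sub remove_quoted_commas_compat {
    my ($string) = @_;
    # $2 is undef whenever the comma branch matches, so silence that
    # warning inside this sub only.
    no warnings 'uninitialized';
    $string =~ s/((?:\G(?!\A)|[^"]*")[^",]*)(?:,|("[^"]*(?:"|\z)))/$1$2/g;
    return $string;
}

my $string = '06/14/2015,19:13:51,"Mrs, Nkoli,,,ka N,ebedo,,m" <ubabankoffice93@gmail.com>,1,2';
print remove_quoted_commas_compat($string), "\n";
```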
Answer 1 (score: 2)
One way is to use the nice Text::ParseWords module to isolate the specific field and perform a simple transliteration to remove the commas.
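A minimal sketch of this approach, assuming the quoted name always sits in the third comma-separated field (index 2). quotewords is called with its keep argument set to 1 so that the double quotes survive in the result; the helper name strip_name_commas is hypothetical.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper, used only for this demonstration.
sub strip_name_commas {
    my ($line) = @_;
    # Split on commas that are not inside quotes; keep = 1 preserves the quotes.
    my @row = quotewords(',', 1, $line);
    # Assumption: the name (plus address) is the third field.
    $row[2] =~ tr/,//d;
    return join ',', @row;
}

my $str = '06/14/2015,19:13:51,"Mrs, Nkoli,,,ka N,ebedo,,m" <ubabankoffice93@gmail.com>,1,2';
print strip_name_commas($str), "\n";
```

Note that quotewords also gives single quotes and backslashes a special meaning, so input containing those outside double quotes may split differently than expected.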
Output:
06/14/2015,19:13:51,"Mrs Nkolika Nebedom" <ubabankoffice93@gmail.com>,1,2
I assume that no commas can legitimately appear in your email field. Otherwise, some other replacement method is required.
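As one possible alternative for that case, the sketch below touches only the text between each pair of double quotes and leaves everything else, including the address, alone. It assumes the quotes are balanced and never escaped; the helper name strip_commas_in_quotes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, used only for this demonstration.
sub strip_commas_in_quotes {
    my ($line) = @_;
    # For every "..." span, rebuild the span without its commas.
    # The /e flag evaluates the replacement as Perl code.
    $line =~ s{"([^"]*)"}{
        my $inner = $1;
        $inner =~ tr/,//d;
        qq{"$inner"};
    }ge;
    return $line;
}

my $str = '06/14/2015,19:13:51,"Mrs, Nkoli,,,ka N,ebedo,,m" <ubabankoffice93@gmail.com>,1,2';
print strip_commas_in_quotes($str), "\n";
```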