Fellows,
I have a flat file containing expressions like the following:
SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY
WHERE IS_SPREAD_OVER == 123
ORDER BY MULTIPLE_LINES
HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER
I want to eliminate the CRLFs between the quotes, and the quotes themselves, so that all my queries are convenient one-liners:
SELECT * FROM CONVENIENT_ONE_LINE_QUERY
SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER
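Please state the regex flavor used in your solution. I am using TextCrawler, which claims to be ECMA262 (the same as VBScript / JavaScript), and the closest I have come to a solution is: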
(\r\n".*)(.*)\r\n(.*"\r\n)
Forgive my inexperience.
Best regards,
Lynx Kepler
Answer 0 (score: 2)
First, replace with a space every CRLF for which the next quote character (") is at the end of a line:
result = subject.replace(/\r\n(?=[^"]*"$)/mg, " ");
Explanation:
\r\n # Match a CRLF
(?= # if and only if
[^"]* # it is followed by any number of non-quote characters
" # and a quote
$ # at the end of a line
) # End of lookahead.
This transforms your example into
SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER
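To check step one against the sample, here is a minimal sketch. It assumes the file content is already in a string with CRLF line endings; the variable names are illustrative only.

```javascript
// Sample input from the question, joined with CRLF line endings (assumed).
const subject = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join('\r\n');

// Step one: replace only the CRLFs whose next quote ends a line.
// The CRLF before the opening quote and the one after the closing
// quote fail the lookahead, so they are left alone.
const joined = subject.replace(/\r\n(?=[^"]*"$)/mg, ' ');

console.log(joined.split('\r\n').length); // 3
```

The three CRLFs inside the quoted query are the only ones replaced, which leaves three lines.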
Then, in a second step, remove the quotes:
result = result.replace(/^"|"$/mg, "");
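Chained together, the two steps might look like the sketch below. The helper name flattenQuotedQueries and the short input string are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: collapses quoted multi-line queries into one-liners.
function flattenQuotedQueries(text) {
  // Step one: CRLF -> space when the next quote is at the end of a line.
  const joined = text.replace(/\r\n(?=[^"]*"$)/mg, ' ');
  // Step two: drop any quote at the start or at the end of a line.
  return joined.replace(/^"|"$/mg, '');
}

const input = 'SELECT 1\r\n"SELECT 2\r\nFROM T"\r\nSELECT 3';
console.log(flattenQuotedQueries(input).split('\r\n'));
// → ["SELECT 1", "SELECT 2 FROM T", "SELECT 3"]
```

This assumes quotes occur only as delimiters at the start and end of a query. A literal quote that happens to end a line inside a query would be treated as a closing delimiter.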
Answer 1 (score: 0)
With Perl, you can do the following:
s/^"([^"]*)"(?=\r?$)/my $s = $1; $s =~ s![\r\n]+! !g; $s/meg
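The same single-pass idea carries over to JavaScript, where String.prototype.replace accepts a function as the replacement. This is only a sketch: the name flattenWithCallback is hypothetical, and whether TextCrawler itself supports replacement callbacks is not something this thread establishes.

```javascript
// Hypothetical helper: match each quoted block once and rewrite it
// in a callback, mirroring the Perl /e approach.
function flattenWithCallback(text) {
  return text.replace(/^"([^"]*)"$/mg, (match, body) =>
    // Collapse each run of CR/LF characters inside the quotes to one space.
    body.replace(/[\r\n]+/g, ' ')
  );
}

const sample = 'SELECT 1\r\n"SELECT 2\r\nFROM T"\r\nSELECT 3';
console.log(flattenWithCallback(sample).split('\r\n'));
// → ["SELECT 1", "SELECT 2 FROM T", "SELECT 3"]
```

Because the quotes sit outside the capture group, they are dropped in the same pass that joins the lines.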