Keep the text but remove the CR LF between tags

Time: 2011-05-18 14:50:37

Tags: javascript regex vbscript flat-file ecma262

Fellow researchers,

I have a flat file that contains the following expressions:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY
WHERE IS_SPREAD_OVER == 123
ORDER BY MULTIPLE_LINES
HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER

I want to remove the CRLFs between the quotes, as well as the quotes themselves, so that all my queries become convenient one-liners:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER

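Please state the regex flavor used in your solution. I am using TextCrawler, which claims to be ECMA262 (the same as VBScript / JavaScript), and the closest I have come to a solution is: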

(\r\n".*)(.*)\r\n(.*"\r\n)

Forgive my inexperience. Best regards, Lynx Kepler

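2 answers:

Answer 0: (score: 2)

As a first step, you can remove every CRLF whose next following " sits at the end of a line (each such CRLF is replaced with a space):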

result = subject.replace(/\r\n(?=[^"]*"$)/mg, " ");

Explanation:

\r\n    # Match a CRLF
(?=     # if and only if
 [^"]*  # it is followed by any number of non-quote characters
 "      # and a quote
 $      # at the end of a line
)       # End of lookahead.

This transforms your example into

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER
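
To check this first step in isolation, here is a minimal JavaScript sketch. The variable names `subject` and `joined` and the hard-coded sample are only for illustration, and it assumes the file really uses CRLF line endings.

```javascript
// Sample from the question, built with explicit CRLF line endings
// (an assumption about how the flat file is stored).
const subject = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join('\r\n');

// Replace a CRLF with a space only when the next quote after it ends a line.
const joined = subject.replace(/\r\n(?=[^"]*"$)/mg, ' ');

// The six input lines collapse to three; the quotes are still present.
console.log(joined.split('\r\n').length); // 3
```

The CRLF in front of the opening quote is left alone because the first quote found after it is followed by text rather than by the end of a line.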

Then, in a second step, remove the quotes:

result = subject.replace(/^"|"$/mg, "");
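
Chained together, the two replacements might look like the sketch below. The helper name `flattenQuotedQueries` is hypothetical, and the sketch assumes that quotes appear only as the opening and closing delimiters of a query and that lines end in CRLF.

```javascript
// Hypothetical helper combining both steps; the name is illustrative only.
function flattenQuotedQueries(text) {
  return text
    .replace(/\r\n(?=[^"]*"$)/mg, ' ') // step 1: join the lines inside quotes
    .replace(/^"|"$/mg, '');           // step 2: drop the delimiting quotes
}

// Tiny stand-in for the flat file: one quoted two-line entry between two plain lines.
const input = 'A\r\n"B\r\nC"\r\nD';
console.log(JSON.stringify(flattenQuotedQueries(input))); // "A\r\nB C\r\nD"
```

The order matters: the first replacement relies on the closing quote still being present at the end of its line, so the quotes have to be stripped last.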

Answer 1: (score: 0)

With Perl, you can do the following:

s/^"([^"]*)"$/$s = $1; $s =~ s!(?:\n|\r)+! !g; $s/meg
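
Since the question asks about an ECMA262 flavor, the same one-pass idea can be sketched in JavaScript with a replacement callback. This is an illustrative port rather than part of the answer above; the function name `unquoteAndJoin` is hypothetical, and whether a given tool such as TextCrawler accepts callbacks is not something this sketch assumes.

```javascript
// One-pass variant: match a whole quoted block that starts and ends on line
// boundaries, then collapse the line breaks inside its captured body.
function unquoteAndJoin(text) {
  return text.replace(/^"([^"]*)"$/mg, (match, body) =>
    body.replace(/[\r\n]+/g, ' '));
}

// Tiny stand-in for the flat file, with CRLF line endings.
const sample = 'A\r\n"B\r\nC"\r\nD';
console.log(unquoteAndJoin(sample) === 'A\r\nB C\r\nD'); // true
```

Because the quotes sit outside the capture group, they are dropped in the same pass that joins the lines.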