I have a string in Ruby that looks something like this:
{
"a boolean": true,
"multiline": "
my
multiline
value
",
"a normal key": "a normal value"
}
I want to match only the newlines inside this substring:
"
my
multiline
value
",
so that I can replace them with escaped newlines. The long-term goal is to make the JSON usable.
Answer 0: (score: 2)
Update - these regexes work fine.
From @faissaloo: "it seemed to fail however on my large JSON."
I ran that large string through both regexes:
PCRE https://regex101.com/r/3jtqea/1
Ruby https://regex101.com/r/1HVCCC/1
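Both behave the same, with no flaws.
If you have any other questions, let me know.
I think Ruby supports Perl-like constructs.
If so, this can be done in a single global find and replace.
Like this:
Edit - Ruby does not support the backtracking control verbs (*SKIP)(*FAIL),
so to do this in Ruby code the regex needs to be more explicit.
With a few modifications to the PCRE / Perl regex, the Ruby equivalent is: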
Ruby
Find
(?-m)((?!\A)\G|(?:(?>[^"]*"[^"\r\n]*"[^"]*))*")([^"\r\n]*)\K\r?\n(?=[^"]*")((?:[^"\r\n]*"(?:(?>[^"]*"[^"\r\n]*"))*[^"]*)?)
Replace
\\n\3
https://regex101.com/r/BaqjEE/1
https://rextester.com/NVFD38349
Explained (it is complicated, though)
(?-m) # Non-multiline mode safety check
( # (1 start), Prefix. Capture for debug
(?! \A ) # Not BOS
\G # Test where last match left off
| # or,
(?: # Optionally align to next " ( only used once )
(?> [^"]* " [^"\r\n]* " [^"]* )
)*
" # A new quote to test
) # (1 end)
( [^"\r\n]* ) # (2), Line break Preamble. Capture for debug
\K # Exclude from the match (group 0) up to this point
\r? \n # Line break to escape
(?= [^"]* " ) # Validate we have " closure
( # (3 start), Optional end quote and alignment.
# To be written back.
(?:
[^"\r\n]* "
(?: # Optionally align to next "
(?> [^"]* " [^"\r\n]* " )
)*
[^"]*
)?
) # (3 end)
# Ruby Code:
#----------------------
# #ruby 2.3.1
#
# re = /(?-m)((?!\A)\G|(?:(?>[^"]*"[^"\r\n]*"[^"]*))*")([^"\r\n]*)\K\r?\n(?=[^"]*")((?:[^"\r\n]*"(?:(?>[^"]*"[^"\r\n]*"))*[^"]*)?)/
# str = '{
# "a boolean": true,
# "a boolean": true,
# "a boolean": true,
# "a boolean": true,
# "multiline": "
# my
# multiline
# value
# asdf"
# ,
#
# "a multiline boo
# lean": true,
# "a normal key": "a multiline
#
# value"
# }'
# subst = '\\n\3'
#
# result = str.gsub(re, subst)
#
# # Print the result of the substitution
# puts result
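As a quick sanity check, the Ruby regex above can be run against the sample string from the question and the result handed to JSON.parse. This is only a minimal sketch: the names NEWLINE_IN_STRING, QUESTION_SAMPLE and escape_newlines are made up for illustration, and it assumes the string values contain no escaped quotes (\").

```ruby
require "json"

# The Ruby find-regex from this answer, copied unchanged.
NEWLINE_IN_STRING = /(?-m)((?!\A)\G|(?:(?>[^"]*"[^"\r\n]*"[^"]*))*")([^"\r\n]*)\K\r?\n(?=[^"]*")((?:[^"\r\n]*"(?:(?>[^"]*"[^"\r\n]*"))*[^"]*)?)/

# Hypothetical helper name. The replacement is a literal backslash + "n",
# followed by whatever group 3 captured (written back unchanged).
def escape_newlines(str)
  str.gsub(NEWLINE_IN_STRING, '\\n\3')
end

# The sample from the question (quoted heredoc, so no escape processing).
QUESTION_SAMPLE = <<~'SAMPLE'
  {
  "a boolean": true,
  "multiline": "
  my
  multiline
  value
  ",
  "a normal key": "a normal value"
  }
SAMPLE

fixed = escape_newlines(QUESTION_SAMPLE)
puts fixed
# The fixed text should now be parseable as JSON.
p JSON.parse(fixed)
```

Only the line breaks inside the quoted value are rewritten; the line breaks between key/value pairs stay as they are.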
For PCRE / Perl
Find
(?:((?:(?>[^"]*"[^"\n]*"[^"]*))+(*SKIP)(*FAIL)|"|(?!^)\G)([^"\n]*)\K\n(?=[^"]*")((?:[^"\n]*")?))
Replace
\\n$3
https://regex101.com/r/06naae/1
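Explained (it is complicated, though)
Note: if you are on a Windows box where the editor requires CRLF line breaks,
add a \r in front of the LF, like this: \r\n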
(?:
( # (1 start), Prefix capture, for debug
(?:
(?> [^"]* " [^"\n]* " [^"]* )
)+
(*SKIP) (*FAIL) # Consume false positives, but ignore them
# (need this to align next ")
| # or,
" # A new quote to test
| # or,
(?! ^ ) # Not BOS
\G # Test where last match left off
) # (1 end)
( [^"\n]* ) # (2), Preamble capture, for debug
\K # Exclude from the match (group 0) up to this point
\n # Line break to escape
(?= [^"]* " ) # Validate we have " closure
( # (3 start), End quote, to be written back
(?: [^"\n]* " )?
) # (3 end)
)
Answer 1: (score: 1)
Answer 2: (score: 0)
Another option is something like this:
string = '{
"a boolean": true,
"multiline": "my
multiline
value",
"a normal value"
}'
# gsub (not gsub!) so that a match without newlines still returns the string instead of nil
puts string.match(/"(\w+)(\n+\w*)+"/).to_s.gsub("\n", '\n')
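This matches the regex against your string and then replaces the newlines in the matched text with escaped newlines. Note that it prints only the matched substring, not the whole document.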
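To apply the same idea to the whole document rather than to one matched substring, gsub can be given a block. The following is a rough sketch, not the answerer's code: the helper name escape_newlines_in_quotes is invented here, the pattern "[^"]*" is a simpler stand-in for the regex above, and it assumes the strings contain no escaped quotes (\").

```ruby
require "json"

# Hypothetical helper: visit every double-quoted run and escape the line
# breaks inside it. Text outside the quotes is left untouched.
# Assumption: no escaped quotes (\") occur inside the strings.
def escape_newlines_in_quotes(text)
  text.gsub(/"[^"]*"/) { |quoted| quoted.gsub(/\r?\n/, '\n') }
end

string = '{
"a boolean": true,
"multiline": "my
multiline
value",
"a normal key": "a normal value"
}'

fixed = escape_newlines_in_quotes(string)
puts fixed
p JSON.parse(fixed)["multiline"]
```

Because the block's return value is inserted as-is, the two-character sequence backslash + n ends up in the output, which JSON.parse then reads back as a real newline.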
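Answer 3: (score: 0)
Answer 4: (score: 0)
If your multiline strings do not contain a comma right before a line break, you can rely on the fact that in JSON laid out like this, every line must end with a comma, { or [, or else the next line must start with } or ]: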
json_string.gsub(/(?<!,|\{|\[)\n(?!\s*[}\]])/, '\n')
If the strings do contain commas (or braces and brackets), you can improve this approach by making the list of valid line endings more specific:
valid_line_ends = %w(true, false, ", }, ], { [)
line_end_matcher = valid_line_ends.map(&Regexp.method(:escape)).join('|')
json_string.gsub(/(?<!#{line_end_matcher})\n(?!\s*[}\]])/, '\n')
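The refined version can be tried on a value that has a comma right before its line break. This is a small sketch under stated assumptions: the constant and method names are illustrative only, and the input is chomped because a trailing newline after the final } is not covered by either lookaround and would be escaped too. Lines ending in numbers or null are not in the list of valid endings, so they would need to be added for real data.

```ruby
require "json"

# Line endings that count as structural, taken from the answer above.
VALID_LINE_ENDS = %w(true, false, ", }, ], { [).freeze
LINE_END_MATCHER = VALID_LINE_ENDS.map(&Regexp.method(:escape)).join('|')

# A newline that does not follow a valid line ending and is not followed
# by a closing } or ] is treated as part of a string value.
NEWLINE_INSIDE_VALUE = /(?<!#{LINE_END_MATCHER})\n(?!\s*[}\]])/

# Hypothetical helper name wrapping the gsub call from the answer.
def escape_inner_newlines(json_string)
  json_string.gsub(NEWLINE_INSIDE_VALUE, '\n')
end

# chomp: a newline after the final } would otherwise be escaped as well.
json_string = <<~'SAMPLE'.chomp
  {
  "a boolean": true,
  "multiline": "one,
  two",
  "a normal key": "a normal value"
  }
SAMPLE

fixed = escape_inner_newlines(json_string)
puts fixed
p JSON.parse(fixed)["multiline"]
```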