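I have text messages in which one line occurs exactly once (the item's ID) and other lines occur several times (a few references and dimensions for each part):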
..some random text here..
ID/11000082734
REF/D14-109-0
REF/D14-209-0
REF/D14-219-0
CMT/59-40-25
CMT/38-25-28
CMT/59-40-25
CMT/37-37-20
CMT/40-40-20
CMT/37-37-20
CMT/49-41-31
CMT/44-34-53
I want to parse and store the IdCode, the References, and an Array with dimensions.
When I apply REGEX.match(my_text), I get only a single occurrence each of REF and CMT:
REGEX = %r{
ID\/(?<IdCode> \d{10})\s
(REF\/(?<ReferenceCode> \w{3}\-\d{3}\-\d)\s)+
(CMT\/(?<Length> \d+)\-(?<Width> \d+)\-(?<Height> \d+)\s)+
}x
The result is:
IdCode: "1100008273"
ReferenceCode: "D14-219-0"
Length: "37"
Width: "37"
Height: "20"
Is there a way to capture multiple occurrences without iterating?
Answer 0 (score: 1)
Suppose your string is:
str = %w| dog
ID/11000082734
REF/D14-109-0
REF/D14-209-0
CMT/49-41-31
CMT/44-34-53
cat
ID/11000082735
REF/D14-109-1
REF/D14-209-1
CMT/49-41-32
CMT/44-34-54
pig |.join("\n")
#=> "dog\nID/11000082734\nREF/D14-109-0\nREF/D14-209-0\nCMT/49-41-31\nCMT/44-34-53\ncat\nID/11000082735\nREF/D14-109-1\nREF/D14-209-1\nCMT/49-41-32\nCMT/44-34-54\npig"
Then you can write:
r = /(ID\/\d{11}) # match string in capture group 1
\n # match newline
((?:REF\/[A-Z]\d{2}-\d{3}-\d\n)+) # match consecutive REF lines in capture group 2
((?:CMT\/\d{2}-\d{2}-\d{2}\n)+) # match consecutive CMT lines in capture group 3
/x # free-spacing regex definition mode
arr = str.scan(r)
#=> [["ID/11000082734", "REF/D14-109-0\nREF/D14-209-0\n",
# "CMT/49-41-31\nCMT/44-34-53\n"],
# ["ID/11000082735", "REF/D14-109-1\nREF/D14-209-1\n",
# "CMT/49-41-32\nCMT/44-34-54\n"]]
This extracts the desired information without iterating.
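Why this works where the pattern in the question did not: a capture group that is itself repeated keeps only its last repetition, whereas a non-capturing repetition wrapped in one outer capture group keeps the whole run. A minimal sketch of the difference; the variable names and the three-line sample are illustrative only and not part of the original answer:

```ruby
text = "REF/D14-109-0\nREF/D14-209-0\nREF/D14-219-0\n"

# Repeated named group: each repetition overwrites the previous capture,
# so only the last REF survives.
m = /(?:REF\/(?<ref>\S+)\n)+/.match(text)
last_only = m[:ref]
#=> "D14-219-0"

# Non-capturing repetition taken as one run, then split in a second pass.
run = text[/(?:REF\/\S+\n)+/]
all_refs = run.scan(/REF\/(\S+)/).flatten
#=> ["D14-109-0", "D14-209-0", "D14-219-0"]
```

This is consistent with the ReferenceCode: "D14-219-0" shown in the question, which is the last REF line of the message.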
At this point you may want to convert arr into a more convenient data structure. For example:
arr.map do |a,b,c|
{ :id => a[/\d+/],
:ref => b.split("\n").map { |s| s[4..-1] },
:cmt => c.scan(/(\d{2})-(\d{2})-(\d{2})/).map { |e|
[:length, :width, :height].zip(e.map(&:to_i)).to_h }
}
end
#=> [{ :id=>"11000082734",
# :ref=>["D14-109-0", "D14-209-0"],
# :cmt=>[{ :length=>49, :width=>41, :height=>31 },
# { :length=>44, :width=>34, :height=>53 }
# ]
# },
# { :id=>"11000082735",
# :ref=>["D14-109-1", "D14-209-1"],
# :cmt=>[{ :length=>49, :width=>41, :height=>32 },
# { :length=>44, :width=>34, :height=>54 }
# ]
# }
# ]
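One caveat with the pattern above: every REF and CMT line has to be followed by a newline, so if a message ends directly after its last CMT line, that line is not captured. The following is a minimal sketch, not part of the original answer, that also accepts the end of the string as a line terminator. BLOCK and parse_blocks are hypothetical names chosen for illustration, and the sketch assumes the same fixed line formats as above.

```ruby
# Hypothetical variant of the pattern above: each REF or CMT line may be
# terminated by a newline or by the end of the string.
BLOCK = /
  ID\/(\d{11})\n                              # capture group 1: the 11-digit ID
  ((?:REF\/[A-Z]\d{2}-\d{3}-\d(?:\n|\z))+)     # capture group 2: consecutive REF lines
  ((?:CMT\/\d{2}-\d{2}-\d{2}(?:\n|\z))+)       # capture group 3: consecutive CMT lines
/x

# Returns one hash per ID block found in the text.
def parse_blocks(text)
  text.scan(BLOCK).map do |id, refs, cmts|
    { id:  id,
      ref: refs.scan(/REF\/(\S+)/).flatten,
      cmt: cmts.scan(/(\d+)-(\d+)-(\d+)/).map { |dims|
        [:length, :width, :height].zip(dims.map(&:to_i)).to_h } }
  end
end

# No trailing newline after the last CMT line.
sample = "ID/11000082734\nREF/D14-109-0\nCMT/49-41-31\nCMT/44-34-53"
blocks = parse_blocks(sample)
p blocks
```

With this sample, blocks holds a single hash whose :cmt array contains both dimension sets, including the final line that has no trailing newline.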
Answer 1 (score: 0)
Try this:
(?<IdCode>\d{10,})|REF\/(?<ReferenceCode>\w{3}\-\d{3}\-\d)|CMT\/(?<Length>\d+)\-(?<Width>\d+)\-(?<Height>\d+)
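Explanation
( … ) : capturing group
? : zero or one occurrence (in this pattern it appears only in the named-group syntax (?<name>…))
\ : escapes a special character
| : alternation (OR)
+ : one or more occurrences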
Input:
..some random text here..
ID/11000082734
REF/D14-109-0
REF/D14-209-0
REF/D14-219-0
CMT/59-40-25
CMT/38-25-28
CMT/59-40-25
CMT/37-37-20
CMT/40-40-20
CMT/37-37-20
CMT/49-41-31
CMT/44-34-53
Output:
MATCH 1
IdCode [29-40] `11000082734`
MATCH 2
ReferenceCode [45-54] `D14-109-0`
MATCH 3
ReferenceCode [59-68] `D14-209-0`
MATCH 4
ReferenceCode [73-82] `D14-219-0`
MATCH 5
Length [87-89] `59`
Width [90-92] `40`
Height [93-95] `25`
MATCH 6
Length [100-102] `38`
Width [103-105] `25`
Height [106-108] `28`
MATCH 7
Length [113-115] `59`
Width [116-118] `40`
Height [119-121] `25`
MATCH 8
Length [126-128] `37`
Width [129-131] `37`
Height [132-134] `20`
MATCH 9
Length [139-141] `40`
Width [142-144] `40`
Height [145-147] `20`
MATCH 10
Length [152-154] `37`
Width [155-157] `37`
Height [158-160] `20`
MATCH 11
Length [165-167] `49`
Width [168-170] `41`
Height [171-173] `31`
MATCH 12
Length [178-180] `44`
Width [181-183] `34`
Height [184-186] `53`
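The match list above comes from a regex tester. In Ruby, one way to get the same matches is String#scan, which yields the capture groups of every match in order, with nil for the alternatives that did not take part. The following is a rough sketch under that assumption; TOKEN and collect are hypothetical names, and the result hash layout is only one possible choice. Note that the IdCode alternative matches any run of ten or more digits, not only the one after ID/.

```ruby
# The alternation from this answer, written as a Ruby regex literal.
TOKEN = /(?<IdCode>\d{10,})|REF\/(?<ReferenceCode>\w{3}\-\d{3}\-\d)|CMT\/(?<Length>\d+)\-(?<Width>\d+)\-(?<Height>\d+)/

# Folds all matches into one hash. Exactly one alternative matches at a time,
# so the groups belonging to the other alternatives are nil.
def collect(text)
  result = { id: nil, refs: [], dims: [] }
  text.scan(TOKEN) do |id, ref, length, width, height|
    if id
      result[:id] = id
    elsif ref
      result[:refs] << ref
    else
      result[:dims] << { length: length.to_i, width: width.to_i, height: height.to_i }
    end
  end
  result
end

message = "..some random text here..\nID/11000082734\nREF/D14-109-0\nREF/D14-209-0\nCMT/59-40-25\nCMT/38-25-28\n"
parsed = collect(message)
p parsed
```

Unlike the pattern in Answer 0, this does not check that the REF and CMT lines follow an ID line, so it suits messages that contain a single item.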