My regex is currently as follows:
(?<country>United States): (?<dial_number>\+1([ ]*[()\d\-\.]+)+)|(?<country>Australia): (?<dial_number>\+61([ ]*[()\d\-\.]+)+)|(?<country>Canada): (?<dial_number>\+1([ ]*[()\d\-\.]+)+)|(?<country>United Kingdom): (?<dial_number>\+44([ ]*[()\d\-\.]+)+)|(?<country>New Zealand): (?<dial_number>\+64([ ]*[()\d\-\.]+)+)
and it is run against a string that looks like this (fake numbers):
Test Meeting
Mon, Jan 15, 2018 10:00 AM - 5:00 PM AEST
Please join my meeting from your computer, tablet or smartphone.
https://example.com/join/50263834
You can also dial in using your phone.
Australia: +61 2 9037 3201
Access Code: 204-761-833
More phone numbers
United States: +1 (571) 417-3429
Austria: +43 7 1081 5425
Belgium: +32 28 92 6018
Canada: +1 (647) 467-9333
Denmark: +45 32 72 01 62
Finland: +358 523 16 0568
France: +33 159 950 514
Germany: +49 692 5536 7287
Ireland: +353 12 360 548
Italy: +39 0 237 92 48 01
Netherlands: +31 107 841 377
New Zealand: +64 9 260 6012
Norway: +47 21 09 36 51
Spain: +34 972 75 2103
Sweden: +46 253 098 826
Switzerland: +41 225 3290 67
United Kingdom: +44 17 3515 4021
First Meeting? Let's do a quick system check: https://example.com/system-check
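I want the match to follow the order in which the alternatives are written in the regex. United States is listed first, so if it is present that match should be returned, even when Australia appears earlier in the string.
Currently, whatever appears first in the string is what gets matched. In the example above, that is Australia.
Is there a way to return the earliest match according to the priority order of the alternatives in the regex?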
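Answer 0 (score: 2)
Regex is not well suited to this kind of ordering. I'm convinced you should match all the values in whatever order they appear and then sort the results according to the order of a reference array.
Here is a small example: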
matches = {"Australia"=>"+61 2 9037 3201",
"United States"=>"+1 (571) 417-3429",
"Canada"=>"+1 (647) 467-9333",
"New Zealand"=>"+64 9 260 6012",
"United Kingdom"=>"+44 17 3515 4021"}
order = ["United States",
"Australia",
"Canada",
"United Kingdom",
"New Zealand"]
puts matches.sort_by { |element| order.index(element.first) }
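The `matches` hash above is written out by hand. As a rough sketch of how it might be built from the text itself, the snippet below uses `String#scan` with an alternation generated from `order`, then picks the highest-priority pair. The shortened `text`, the loose number pattern, and the names `pattern`, `sorted` and `best` are assumptions made for illustration, not part of the original answer.

```ruby
# Shortened, hypothetical sample in the same "Country: +number" layout.
text = <<~INVITE
  Australia: +61 2 9037 3201
  Access Code: 204-761-833
  United States: +1 (571) 417-3429
  Austria: +43 7 1081 5425
  Canada: +1 (647) 467-9333
  New Zealand: +64 9 260 6012
  United Kingdom: +44 17 3515 4021
INVITE

# Priority order, highest first.
order = ["United States", "Australia", "Canada", "United Kingdom", "New Zealand"]

# Regexp.union escapes each name and joins them with "|".
# The number pattern is deliberately loose: "+", a digit, then digits,
# spaces, parentheses, dashes or dots up to the end of the line.
pattern = /^(#{Regexp.union(order)}):[ ]+(\+\d[\d ()\-.]*)/

# scan yields [country, number] pairs in the order they occur in the text.
matches = text.scan(pattern).to_h

# All pairs, reordered by their position in the priority list.
sorted = matches.sort_by { |country, _| order.index(country) }

# Only the highest-priority pair.
best = matches.min_by { |country, _| order.index(country) }
#=> ["United States", "+1 (571) 417-3429"]
```

Because the alternation is generated from `order`, every scanned country is guaranteed to have an index, so `order.index` never returns `nil` during sorting.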
Answer 1 (score: 2)
Suppose we are given the following string.
str=<<BITTER_END
Test Meeting
Mon, Jan 15, 2018 10:00 AM - 5:00 PM AEST
Please join my meeting from your computer, tablet or smartphone.
https://example.com/join/50263834
You can also dial in using your phone.
Australia: +61 2 9037 3201
Access Code: 204-761-833
More phone numbers
United States: +1 (571) 417-3429
Austria: +43 7 1081 5425
Belgium: +32 28 92 6018
Canada: +1 (647) 467-9333
Denmark: +45 32 72 01 62
Finland: +358 523 16 0568
France: +33 159 950 514
Germany: +49 692 5536 7287
Ireland: +353 12 360 548
Italy: +39 0 237 92 48 01
Netherlands: +31 107 841 377
New Zealand: +64 9 260 6012
Norway: +47 21 09 36 51
Spain: +34 972 75 2103
Sweden: +46 253 098 826
Switzerland: +41 225 3290 67
United Kingdom: +44 17 3515 4021
First Meeting? Let's do a quick system check: https://example.com/system-check
BITTER_END
I would be inclined to first build a hash from this string whose keys are country names and whose values are phone numbers.
r = /
^ # match start of line
(?<country>[\p{L} ]+) # match >= 1 letters and spaces in named group country
:[ ]+ # match a colon and >= 1 spaces
(?<dial_number> # begin a named group dial_number
\+ # match a literal +
(?: # begin a non-capture group
# US and Canada
1[ ]+ # match 1 followed by >= 1 spaces
\(\d{3}\) # match a left paren, 3 digits, a right paren
[ ]+ # match >= 1 spaces
\d{3}\-\d{4} # match 3 digits, a dash and 4 digits
| # or
# rest of world
\d{2,3} # match 2 or 3 digits
(?: # begin a non-capture group
[ ]+ # match >=1 spaces
\d{1,4} # match 1 to 4 digits
){3,5} # close non-capture group and perform 3-5 times
) # close non-capture group
) # close named group dial_number
/x # free-spacing regex definition mode
h = str.each_line.with_object({}) do |line, h|
m = line.match r
h[m[:country]] = m[:dial_number] unless m.nil?
end
#=> {"Australia"=>"+61 2 9037 3201", "United States"=>"+1 (571) 417-3429",
# "Austria"=>"+43 7 1081 5425", "Belgium"=>"+32 28 92 6018",
# ...
# "Switzerland"=>"+41 225 3290 67", "United Kingdom"=>"+44 17 3515 4021"}
We can then retrieve phone numbers in the usual way.
h["United States"]
#=> "+1 (571) 417-3429"
h["Shangri-La"]
#=> nil
If you have a priority list of countries and want to find the first one that is a key in h, then retrieve its phone number, do the following.
priority = ["Fiji", "Shangri-La", "United States", "Finland"]
country = priority.find { |country| h.key?(country) }
#=> "United States"
country ? [country, h[country]] : nil
#=> ["United States", "+1 (571) 417-3429"]
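If the intermediate hash is not needed, the same priority lookup can be sketched by trying each country against the text in turn and stopping at the first hit. The helper name `first_dial_in`, the shortened `invite` text and the loose number pattern are all hypothetical, and this is one possible variation rather than the method used above.

```ruby
# Hypothetical helper: returns [country, number] for the first country in
# `priority` that has a "Country: +number" line in `text`, or nil if none does.
def first_dial_in(text, priority)
  priority.each do |country|
    # Anchoring on the full escaped name plus the colon means a line for
    # "Austria" cannot satisfy a lookup for "Australia".
    number = text[/^#{Regexp.escape(country)}:[ ]+(\+\d[\d ()\-.]*)/, 1]
    return [country, number.strip] if number
  end
  nil
end

# Shortened, hypothetical sample text.
invite = <<~INVITE
  Australia: +61 2 9037 3201
  United States: +1 (571) 417-3429
  Finland: +358 523 16 0568
INVITE

first_dial_in(invite, ["Fiji", "Shangri-La", "United States", "Finland"])
#=> ["United States", "+1 (571) 417-3429"]
first_dial_in(invite, ["Fiji"])
#=> nil
```

This scans the text once per candidate country, which is fine for a short priority list. For long lists or repeated lookups, building the hash once is likely the better trade-off.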