Java regex for Google Maps URLs?

Asked: 2017-03-24 17:41:49

Tags: java regex google-maps

I want to extract all Google Maps links from a String. The links come in the following formats:

First example: https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z/data=!3m1!4b1!4m5!3m4!1s0x89b7b7bcdecbb1df:0x715969d86d0b76bf!8m2!3d38.8976763!4d-77.0365298

https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z

https://www.google.com/maps/place//@38.8976763,-77.0387185,17z

https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z

https://www.google.com/maps/place/@38.8976763,-77.0387185,17z

https://google.com/maps/place/@38.8976763,-77.0387185,17z

http://google.com/maps/place/@38.8976763,-77.0387185,17z

https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z

These are all valid Google Maps URLs (linking to the White House).

This is what I tried:

String gmapLinkRegex = "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/.*)?@(.*z)[^ ]*";
Pattern patternGmapLink = Pattern.compile(gmapLinkRegex , Pattern.CASE_INSENSITIVE);
Matcher m = patternGmapLink.matcher(s);
while (m.find()) {
  logger.info("group0 = {}" , m.group(0));
  String place = m.group(4); 
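  place = StringUtils.stripEnd(place , "/"); // remove trailing '/'
  // intended to remove the leading "place/"; note that Commons Lang's stripStart
  // treats its second argument as a set of characters, not as a prefix
  place = StringUtils.stripStart(place , "place/");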
  logger.info("place = '{}'" , place);
  String latLngZ = m.group(5);
  logger.info("latLngZ = '{}'" , latLngZ);
}

It works in simple cases, but it still has bugs. For example:

Post-processing is needed to get the optional place information,

and it cannot extract two URLs from a single line, for example:

s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z " +
      " and http://google.com/maps/place/@38.8976763,-77.0387185,17z";

This should yield two URLs, but the regex matches the whole line...

Key points:

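  • The whole URL should be matched by group(0) (including the trailing data part in the first example).
  • In the first example, if the zoom level (17z) is removed, it is still a valid gmap URL, but my regex cannot match it.
  • The optional place information should be easier to extract.
  • Lat/Lng must be extracted; the zoom level is optional.
  • It should be able to parse multiple URLs in one line.
  • It should handle maps.google.com(.xx)/maps. I tried (www|maps\.)? but it still seems to have bugs.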

Any suggestions for improving this regex? Thanks a lot!

2 Answers:

Answer 0: (score: 2)

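The dot-asterisk

.*

will always allow anything, right up to the end of the last URL. You need "tighter" regexes, which match a single URL but not several URLs with arbitrary text in between. The "[^ ]*" might include the next URL if it is separated by something other than " ", which includes line breaks, tabs, shift-spaces...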
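To make the difference concrete, here is a minimal sketch that runs the asker's original pattern and a tightened variant over the two-URL line from the question. The class name GreedyDemo, the helper findAll and the TIGHT pattern are made up for illustration; TIGHT only swaps each ".*" for a character class that cannot cross an "@" or whitespace and is not meant as a finished solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // The asker's original pattern: ".*" appears twice and is greedy.
    public static final Pattern GREEDY = Pattern.compile(
        "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/.*)?@(.*z)[^ ]*",
        Pattern.CASE_INSENSITIVE);

    // Illustrative variant (an assumption, not from the thread):
    // the place part may not contain '@' or whitespace, and the
    // lat/lng/zoom part may not contain whitespace or '/'.
    public static final Pattern TIGHT = Pattern.compile(
        "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/[^@\\s]*)?@([^\\s/]*z)\\S*",
        Pattern.CASE_INSENSITIVE);

    // Collects group(0) of every match found in the input.
    public static List<String> findAll(Pattern p, String s) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            out.add(m.group(0));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z "
                 + " and http://google.com/maps/place/@38.8976763,-77.0387185,17z";
        // GREEDY: ".*" runs to the last '@' in the line, so the whole line is one match.
        System.out.println(findAll(GREEDY, s).size()); // 1
        // TIGHT: each URL stops at the first whitespace.
        System.out.println(findAll(TIGHT, s).size());  // 2
    }
}
```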

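I propose (sorry, not tested in Java) using "anything but @", then "digits, minus, comma or dot", then "an optional special string followed by a tailored character set, many times".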

"(http|https)://(www\.)?google\.com(\.\w*)?/maps/(place/[^@]*)?@([0123456789\.,-]*z)(\/data=[\!:\.\-0123456789abcdefmsx]+)?"

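I tested the one above on a Perl-regex compatible engine (np++). Please adapt it yourself if I guessed anything wrong. The explicit list of digits can probably be replaced by "\d"; I tried to minimise assumptions about the regex flavor.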
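For Java, the pattern above has to be written as a string literal with every backslash doubled. The following is a rough transcription under that assumption, checked against some of the sample URLs from the question; the class name AnswerPatternCheck and the helper isMapUrl are hypothetical, and matches() is used so that the whole string has to be a single URL.

```java
import java.util.regex.Pattern;

public class AnswerPatternCheck {
    // The answer's pattern as a Java string literal (each backslash doubled).
    public static final Pattern MAP_URL = Pattern.compile(
        "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/[^@]*)?"
      + "@([0123456789\\.,-]*z)(\\/data=[\\!:\\.\\-0123456789abcdefmsx]+)?");

    // True only if the entire string is one URL accepted by the pattern.
    public static boolean isMapUrl(String s) {
        return MAP_URL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z",
            "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z",
            "https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z",
            "https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z",
            "https://www.google.com/maps/place/@38.8976763,-77.0387185"
        };
        for (String s : samples) {
            System.out.println(isMapUrl(s) + "  " + s);
        }
    }
}
```

In this transcription the maps.google.com sample and the URL without a trailing zoom "z" are rejected, so those two points from the question remain open.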

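To match either "URL" or "URL and URL", store the regex in a variable and then build "(URL and )*URL", replacing "URL" with the regex variable. (Assuming this is possible in Java.) If the question is how to then retrieve the multiple matches: that is Java and I cannot help. Let me know and I will delete this answer rather than provoke deserved downvotes ;-)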

(Edited to catch the data part, which I had not seen before, in the first line of the first example; and the multiple URLs in one line.)
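On the Java side, retrieving several matches from one line does not require a combined "(URL and )*URL" pattern: calling Matcher.find() in a loop returns each match in turn. The sketch below is one possible way to do that, not the answerer's pattern. The class GmapLinkFinder, the holder MapLink and the group names place, lat, lng and zoom are all made up for illustration. It assumes coordinates are plain decimals, treats the zoom as optional, and writes the host prefix as (?:(?:www|maps)\.)? so that the dot applies to both alternatives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GmapLinkFinder {

    /** One extracted link; place and zoom are null when absent. */
    public static final class MapLink {
        public final String url, place, lat, lng, zoom;
        MapLink(String url, String place, String lat, String lng, String zoom) {
            this.url = url; this.place = place;
            this.lat = lat; this.lng = lng; this.zoom = zoom;
        }
    }

    private static final Pattern MAP_URL = Pattern.compile(
          "https?://(?:(?:www|maps)\\.)?google\\.com(?:\\.[a-z]+)?/maps/"
        + "(?:place/(?<place>[^@/\\s]*)/*)?"                      // optional place name
        + "@(?<lat>-?\\d+(?:\\.\\d+)?),(?<lng>-?\\d+(?:\\.\\d+)?)" // required lat,lng
        + "(?:,(?<zoom>\\d+(?:\\.\\d+)?)z)?"                       // optional zoom level
        + "(?:/data=\\S*)?",                                       // optional data part
        Pattern.CASE_INSENSITIVE);

    /** Returns every Google Maps link found in the text, in order. */
    public static List<MapLink> findAll(String text) {
        List<MapLink> links = new ArrayList<>();
        Matcher m = MAP_URL.matcher(text);
        while (m.find()) {
            String place = m.group("place");
            if (place != null && place.isEmpty()) {
                place = null; // "place//@..." and "place/@..." carry no name
            }
            links.add(new MapLink(m.group(0), place,
                    m.group("lat"), m.group("lng"), m.group("zoom")));
        }
        return links;
    }

    public static void main(String[] args) {
        String s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z "
                 + " and http://google.com/maps/place/@38.8976763,-77.0387185,17z";
        for (MapLink link : findAll(s)) {
            System.out.println(link.url + " lat=" + link.lat
                    + " lng=" + link.lng + " zoom=" + link.zoom);
        }
    }
}
```

Because the data part is matched with \S*, punctuation that directly follows a URL in running text would be captured as well; a stricter character set like the one in the pattern above would avoid that.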

Answer 1: (score: 0)

I wrote this regex to validate Google Maps links:

"(http:|https:)?\\/\\/(www\\.)?(maps.)?google\\.[a-z.]+\\/maps/?([\\?]|place/*[^@]*)?/*@?(ll=)?(q=)?(([\\?=]?[a-zA-Z]*[+]?)*/?@{0,1})?([0-9]{1,3}\\.[0-9]+(,|&[a-zA-Z]+=)-?[0-9]{1,3}\\.[0-9]+(,?[0-9]+(z|m))?)?(\\/?data=[\\!:\\.\\-0123456789abcdefmsx]+)?"

I tested it with the following list of Google Maps links:

String location1 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location2 = "https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z";
String location3 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location4 = "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z/data=!3m1!4b1!4m5!3m4!1s0x89b7b7bcdecbb1df:0x715969d86d0b76bf!8m2!3d38.8976763!4d-77.0365298";
String location5 = "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z";
String location6 = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z";
String location7 = "https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z";
String location8 = "https://www.google.com/maps/place/@38.8976763,-77.0387185,17z";
String location9 = "https://google.com/maps/place/@38.8976763,-77.0387185,17z";
String location10 = "http://google.com/maps/place/@38.8976763,-77.0387185,17z";
String location11 = "https://www.google.com/maps/place/@/data=!4m2!3m1!1s0x3135abf74b040853:0x6ff9dfeb960ec979";
String location12 = "https://maps.google.com/maps?q=New+York,+NY,+USA&hl=no&sll=19.808054,-63.720703&sspn=54.337928,93.076172&oq=n&hnear=New+York&t=m&z=10";
String location13 = "https://www.google.com/maps";
String location14 = "https://www.google.fr/maps";
String location15 = "https://google.fr/maps";
String location16 = "http://google.fr/maps";
String location17 = "https://www.google.de/maps";
String location18 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location19 = "https://www.google.de/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location20 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4&layer=t&lci=com.panoramio.all,com.google.webcams,weather";
String location21 = "https://www.google.com/maps?ll=37.370157,0.615234&spn=45.047033,93.076172&t=m&z=4&layer=t";
String location22 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location23 = "https://www.google.de/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location24 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4&layer=t&lci=com.panoramio.all,com.google.webcams,weather";
String location25 = "https://www.google.com/maps?ll=37.370157,0.615234&spn=45.047033,93.076172&t=m&z=4&layer=t";
String location26 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location27 = "http://google.com/maps/bylatlng?lat=21.01196022&lng=105.86298748";
String location28 = "https://www.google.com/maps/place/C%C3%B4ng+vi%C3%AAn+Th%E1%BB%91ng+Nh%E1%BA%A5t,+354A+%C4%90%C6%B0%E1%BB%9Dng+L%C3%AA+Du%E1%BA%A9n,+L%C3%AA+%C4%90%E1%BA%A1i+H%C3%A0nh,+%C4%90%E1%BB%91ng+%C4%90a,+H%C3%A0+N%E1%BB%99i+100000,+Vi%E1%BB%87t+Nam/@21.0121535,105.8443773,13z/data=!4m2!3m1!1s0x3135ab8ee6df247f:0xe6183d662696d2e9";
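The answer does not say how the links were checked against the regex, so the following is only a sketch of one way to run such a test. The class MapLinkValidator and its two helpers are hypothetical. It keeps the pattern verbatim, including the unescaped dot in "(maps.)?", which matches any character after "maps". It shows both full-string matching with matches() and prefix matching with lookingAt(), since the two can give different answers for the same URL.

```java
import java.util.regex.Pattern;

public class MapLinkValidator {
    // The answer's pattern, copied verbatim.
    public static final Pattern GMAP = Pattern.compile(
        "(http:|https:)?\\/\\/(www\\.)?(maps.)?google\\.[a-z.]+\\/maps/?([\\?]|place/*[^@]*)?/*@?(ll=)?(q=)?(([\\?=]?[a-zA-Z]*[+]?)*/?@{0,1})?([0-9]{1,3}\\.[0-9]+(,|&[a-zA-Z]+=)-?[0-9]{1,3}\\.[0-9]+(,?[0-9]+(z|m))?)?(\\/?data=[\\!:\\.\\-0123456789abcdefmsx]+)?");

    // Strict check: the entire string must be consumed by the pattern.
    public static boolean matchesWhole(String url) {
        return GMAP.matcher(url).matches();
    }

    // Lenient check: the string only has to start with a match.
    public static boolean matchesPrefix(String url) {
        return GMAP.matcher(url).lookingAt();
    }

    public static void main(String[] args) {
        String[] urls = {
            "https://www.google.com/maps",
            "https://www.google.com/maps/place/@38.8976763,-77.0387185,17z",
            "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4"
        };
        for (String u : urls) {
            System.out.printf("whole=%b prefix=%b %s%n",
                    matchesWhole(u), matchesPrefix(u), u);
        }
    }
}
```

Since almost everything after "/maps" is optional in this pattern, the prefix check accepts any string that merely begins with a Google Maps address, so the strict check is probably the more meaningful one for validation.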