","(.*?) (Railway Station)
I need to extract the station name and the latitude/longitude from a list of raw data in the following format:
"22238","Flinders Street Railway Station (Melbourne City)","-37.8183051340585","144.966964346166"
"22239","North Melbourne Railway Station (West Melbourne)","-37.8063098353473","144.94151017321"
"22240","Footscray Railway Station (Footscray)","-37.8014134330439","144.902020057667"
"22241","Sunshine Railway Station (Sunshine)","-37.7885363319246","144.832878204953"
The desired output is:
Flinders Street
-37.8183051340585,144.966964346166
North Melbourne
-37.8063098353473,144.94151017321
Footscray
-37.8014134330439,144.902020057667
Sunshine
-37.7885363319246,144.832878204953
I would like some suggestions on how to approach this problem.
Using
","(.*?) (Railway Station)
extracts the station name, but it also returns two unwanted results:
","Flinders Street Railway Station
Flinders Street
Railway Station
In the above, how can I match only Flinders Street?
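Second, to retrieve the latitude and longitude, should I run a separate regex call, or can a single search pattern do it?
Finally, should I strip the quotes from the latitude/longitude in the regex itself, or afterwards in code, splitting the problem into two steps?
For example, from: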
"-37.8183051340585","144.966964346166"
to:
-37.8183051340585,144.966964346166
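Or am I looking at this from the wrong angle? Would it be simpler to split the text using , as the delimiter and then focus on the smaller substrings that have a specific pattern? What are your thoughts?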
Answer 0 (score: 2)
You can do it like this:
"([^"]+)\s+Railway\sStation[^,]+,"([^"]+)","([^"]+)"$
For the first row, \1 is Flinders Street, \2 is -37.8183051340585, and \3 is 144.966964346166.
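As a rough illustration of how the three groups could be read out in code, here is a minimal sketch in Python. The choice of Python, the names RAW, PATTERN and extract_stations, and the use of re.MULTILINE (so that $ matches at the end of each row rather than only at the end of the text) are assumptions made for this example and are not part of the original answer.

```python
import re

# Two sample rows copied from the question.
RAW = '''"22238","Flinders Street Railway Station (Melbourne City)","-37.8183051340585","144.966964346166"
"22239","North Melbourne Railway Station (West Melbourne)","-37.8063098353473","144.94151017321"'''

# The pattern from this answer. MULTILINE makes $ match at the end of every line.
PATTERN = re.compile(
    r'"([^"]+)\s+Railway\sStation[^,]+,"([^"]+)","([^"]+)"$',
    re.MULTILINE,
)


def extract_stations(text):
    """Return one (name, latitude, longitude) tuple of strings per matching row."""
    return [m.groups() for m in PATTERN.finditer(text)]


stations = extract_stations(RAW)
for name, lat, lon in stations:
    print(name)
    print(f"{lat},{lon}")
```

Because the quotes sit outside the capture groups, the coordinates come back already unquoted, so no second clean-up step is needed.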
Answer 1 (score: 1)
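Your regex
","(.*?) (Railway Station)
matches, as you say in your example,
","Flinders Street Railway Station
It captures Flinders Street and Railway Station. Note the difference between matching and capturing. The match is everything the regex matched. A capture is the part of the match that corresponds to a section of the regex enclosed in (). Your (.*?) and (Railway Station) therefore give you two capture groups.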
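The match/capture distinction can be seen directly in code. This is a small sketch in Python (my choice of language for illustration; the variable names are hypothetical), using the first row from the question:

```python
import re

# First sample row from the question.
line = '"22238","Flinders Street Railway Station (Melbourne City)","-37.8183051340585","144.966964346166"'

m = re.search(r'","(.*?) (Railway Station)', line)

full_match = m.group(0)  # everything the pattern matched
captures = m.groups()    # only the parts enclosed in parentheses

print(full_match)  # ","Flinders Street Railway Station
print(captures)    # ('Flinders Street', 'Railway Station')
```

Tools that list "three matches" for this pattern are usually showing the full match followed by the two capture groups.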
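To match Railway Station without capturing it, remove the parentheses, e.g.
","(.*?) Railway Station
This matches the same text as your pattern but captures only the station name. Then, to match the city without capturing it, add
\([^)]*\)
which matches a pair of literal parentheses and anything between them.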
Finally, adding
","([^"]*)","([^"]*)"
captures the coordinates in two further capture groups, giving the final pattern
","(.*?) Railway Station \([^)]*\)","([^"]*)","([^"]*)"
which does the job.
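To show how the final pattern might be used to produce the output format requested in the question, here is a minimal sketch in Python. The language, the names RAW, FINAL and format_rows, and the decision to emit the name and the coordinates on alternating lines are assumptions for this example.

```python
import re

# The four sample rows from the question.
RAW = '''"22238","Flinders Street Railway Station (Melbourne City)","-37.8183051340585","144.966964346166"
"22239","North Melbourne Railway Station (West Melbourne)","-37.8063098353473","144.94151017321"
"22240","Footscray Railway Station (Footscray)","-37.8014134330439","144.902020057667"
"22241","Sunshine Railway Station (Sunshine)","-37.7885363319246","144.832878204953"'''

# Final pattern from this answer: three capture groups (name, latitude, longitude).
FINAL = re.compile(r'","(.*?) Railway Station \([^)]*\)","([^"]*)","([^"]*)"')


def format_rows(text):
    """Return the station name and 'lat,lon' on alternating lines."""
    lines = []
    # With several groups, findall yields one tuple of captures per match.
    for name, lat, lon in FINAL.findall(text):
        lines.append(name)
        lines.append(f"{lat},{lon}")
    return "\n".join(lines)


output = format_rows(RAW)
print(output)
```

A single pattern handles the name and both coordinates in one pass, and since the quotes are matched outside the groups they never appear in the captured values.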
Regards