I get output like this from a query:
[ (14577692L, 'POINT(-122.106035882 37.397386475)'), (14577692L, 'POINT(-122.106035882 37.397386475)'), (14577692L, 'POINT(-122.106035882 37.397386475)') ]
I want to pull the lat and long values out of each POINT value separately, using a regular expression such as:
_RE = re.compile('\(\([\d\-\., ]*\)\)')
for i in cursor.fetchall():
    for p in _RE.findall(i[1]):
        # I want latitude and longitude value from POINT(-122.106035882 37.397386475)
My regular expression is wrong. Can someone help me correct it:
_RE = re.compile('\(\([\d\-\., ]*\)\)')
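Answer 0 (score: 5)
This does not need a regular expression. Because the format of POINT() is static, you can simply slice out the part of the string that contains the coordinates and split it on the space: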
resultset = [
    (14577692L, 'POINT(-122.106035882 37.397386475)'),
    (14577692L, 'POINT(-122.106035882 37.397386475)'),
    (14577692L, 'POINT(-122.106035882 37.397386475)')
]
for row in resultset:
    # Drop the leading 'POINT(' and the trailing ')'.
    coordinatestring = row[1][6:-1]
    # WKT lists x (longitude) first, then y (latitude).
    lon, lat = (float(x) for x in coordinatestring.split(' '))
    do_something_with(lat, lon)
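The slice notation [6:-1] drops the first 6 characters and the last character of the original string, which are POINT( and ) respectively. That leaves two numbers separated by a space, which are easy to process as shown above.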
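The slicing approach can be wrapped in a small helper. This is a minimal sketch written for Python 3, so the `L` suffix on the id is dropped. The function name `parse_point` and the prefix check are illustrative additions, not part of the original answer. It assumes the usual WKT ordering, where the first number is x (longitude) and the second is y (latitude).

```python
def parse_point(wkt):
    """Return (lon, lat) as floats from a string like 'POINT(x y)'.

    Hypothetical helper: the name and the validation are assumptions
    added for illustration.
    """
    prefix, suffix = "POINT(", ")"
    if not (wkt.startswith(prefix) and wkt.endswith(suffix)):
        raise ValueError("not a POINT string: %r" % wkt)
    # Same idea as wkt[6:-1], but derived from the prefix length.
    body = wkt[len(prefix):-len(suffix)]
    # split() with no argument tolerates repeated whitespace.
    lon, lat = (float(part) for part in body.split())
    return lon, lat


# Rows shaped like the query output.
rows = [(14577692, "POINT(-122.106035882 37.397386475)")]
points = [parse_point(wkt) for _id, wkt in rows]
print(points)
```

Raising `ValueError` on an unexpected prefix is one possible design choice; it keeps a non-POINT geometry from being silently sliced into nonsense.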
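If you absolutely must use a regular expression, use a raw string so you do not have to escape characters twice, and use two capture groups so you can tell the first coordinate from the second: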
>>> import re
>>> _RE = re.compile(r'POINT\(([-\d\.]+)\s([-\d\.]+)\)')
>>> _RE.groups
2
>>> _RE.search('POINT(-122.106035882 37.397386475)').groups()
('-122.106035882', '37.397386475')
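To see why the pattern from the question finds nothing, it can be compared with the corrected one. The sketch below is for Python 3 and the variable names are illustrative. The question's pattern starts with `\(\(`, which requires two literal opening parentheses in a row, and the input only ever has one.

```python
import re

text = "POINT(-122.106035882 37.397386475)"

# Pattern from the question: needs "((" ... "))", which the text lacks.
original = re.compile(r"\(\([\d\-\., ]*\)\)")
# Pattern from this answer: one literal "(" and two capture groups.
corrected = re.compile(r"POINT\(([-\d\.]+)\s([-\d\.]+)\)")

print(original.findall(text))   # empty list: no double parentheses
print(corrected.findall(text))  # one (first, second) tuple of strings
```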
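Even that regular expression is overkill, though. Since you know the format of POINT() is static, you can look for the values themselves and ignore the letters and parentheses: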
>>> _RE = re.compile(r'([-\d\.]+)\s([-\d\.]+)')
>>> _RE.search('POINT(-122.106035882 37.397386475)').groups()
('-122.106035882', '37.397386475')
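At this point it is simple enough to suggest that you may not need a regular expression at all (as I have already shown). Questioning whether re is necessary and considering simpler alternatives is never a bad idea.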
Answer 1 (score: 2)
More explicitly:
import re
p = re.compile(r"POINT\(([-\d\.]+)\s([-\d\.]+)\)")
data = [
    (14577692L, 'POINT(-122.106035882 37.397386475)'),
    (14577692L, 'POINT(-122.106035882 37.397386475)'),
    (14577692L, 'POINT(-122.106035882 37.397386475)')
]
for record in data:
    # WKT lists x (longitude) first, then y (latitude).
    lon, lat = p.search(record[1]).groups()
    print lon, lat
Result:
-122.106035882 37.397386475
-122.106035882 37.397386475
-122.106035882 37.397386475
You can also get a dictionary by using named groups:
p = re.compile(r"POINT\((?P<lon>[-\d\.]+)\s(?P<lat>[-\d\.]+)\)")
...
for record in data:
    coordinates = p.match(record[1]).groupdict()
    print coordinates
Result:
{'lat': '37.397386475', 'lon': '-122.106035882'}
{'lat': '37.397386475', 'lon': '-122.106035882'}
{'lat': '37.397386475', 'lon': '-122.106035882'}
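Note that the values in the groups are strings, not numbers. A minimal Python 3 sketch of turning the named groups into floats follows. The helper name `point_to_dict` is hypothetical, and the group names assume the WKT order of longitude first, then latitude.

```python
import re

POINT_RE = re.compile(r"POINT\((?P<lon>[-\d.]+)\s(?P<lat>[-\d.]+)\)")


def point_to_dict(wkt):
    """Return {'lon': float, 'lat': float}, or None if wkt does not match."""
    m = POINT_RE.match(wkt)
    if m is None:
        return None
    # groupdict() maps each group name to the matched string.
    return {name: float(value) for name, value in m.groupdict().items()}


print(point_to_dict("POINT(-122.106035882 37.397386475)"))
```

Returning `None` for a non-matching string avoids the `AttributeError` that calling `.groupdict()` on a failed match would raise.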
Answer 2 (score: 0)
POINT\((-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\)
Try this. See the demo:
https://regex101.com/r/sH8aR8/32
import re
p = re.compile(r'POINT\((-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\)', re.IGNORECASE | re.DOTALL)
test_str = "[ (14577692L, 'POINT(-122.106035882 37.397386475)'), (14577692L, 'POINT(-122.106035882 37.397386475)'), (14577692L, 'POINT(-122.106035882 37.397386475)') ]"
re.findall(p, test_str)
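With two capture groups, `re.findall` returns a list of tuples of strings, one tuple per match. A hedged Python 3 sketch of consuming that result is below. The names `POINT_RE` and `coords` are illustrative, and the input is shortened to two rows.

```python
import re

POINT_RE = re.compile(r"POINT\((-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\)")

test_str = (
    "[ (14577692L, 'POINT(-122.106035882 37.397386475)'), "
    "(14577692L, 'POINT(-122.106035882 37.397386475)') ]"
)

# Each match is a (first, second) tuple of strings; convert to floats.
coords = [(float(x), float(y)) for x, y in POINT_RE.findall(test_str)]
print(coords)
```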