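How can I simplify a function that checks whether a string contains coordinates describing a polygon?

Time: 2017-07-19 14:17:27

Tags: python

I have the following string: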

points = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"

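points is a string containing comma-separated coordinate pairs (latitude and longitude).

I want to check that this string contains only integer or float values, and that the first coordinate is equal to the last one.

I have the following code: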

def validate_points(points):
   coordinates = points.split(',')

   for point in coordinates:
      latlon = point.split(' ')

      latitude = latlon[0]
      longitude = latlon[1]
      if not is_number(latitude) or not is_number(longitude):
         raise WrongRequestDataError("Please, specify the correct type of points value. It must be a numeric value")

   first = coordinates[0]
   last = coordinates[len(coordinates) - 1]
   if first != last:
        raise WrongRequestDataError("Incorrect points format, the first point must be equal to last")

def is_number(s):
   try:
     if float(s) or int(s):
        return True
   except ValueError:
        return False

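Is there a way to simplify or speed up this code?

5 answers:

Answer 0 (score: 2)

Your input looks almost like a WKT polygon.

With the shapely package, you can simply try to parse the points as WKT and see what happens, following Python's "Easier to ask for forgiveness than permission" principle: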

# pip install shapely
from shapely import wkt

def is_well_defined_polygon(points):
    try:
        wkt.loads("POLYGON((%s))" % points)
        return True
    except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
        return False

points = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748, 34.09352 -118.27483"

print(is_well_defined_polygon(points))
# True
print(is_well_defined_polygon("1 2, 3 4"))
# IllegalArgumentException: Points of LinearRing do not form a closed linestring
# False
print(is_well_defined_polygon("a b c d"))
# ParseException: Expected number but encountered word: 'a'
# False

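Answer 1 (score: 0)

Here are a few improvements. You can speed up the is_number function slightly, and use coordinates[-1] instead of coordinates[len(coordinates) - 1]. You also don't necessarily need to define all of those variables: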

def validate_points(points):
    # Strip each pair so the space after a comma does not affect parsing or comparison.
    coordinates = [point.strip() for point in points.split(',')]

    for point in coordinates:
        latitude, longitude = point.split(' ', 1)

        if not is_number(latitude) or not is_number(longitude):
            raise WrongRequestDataError("Please, specify the correct type of points value. It must be a numeric value")

    if coordinates[0] != coordinates[-1]:
        raise WrongRequestDataError("Incorrect points format, the first point must be equal to last")

def is_number(s):
    try:
        float(s)  # float() also accepts integer literals, and "0" or "0.0" must count as numbers
        return True
    except ValueError:
        return False

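Answer 2 (score: 0)

Minor points:

  • Use coordinates[-1] instead of coordinates[len(coordinates)-1].
  • Use latitude, longitude = point.split(' ', 1). This makes input such as 3.41 47.11 foobar invalid.
  • Do you really need latitude and longitude to be strings? You probably want float / int values, so is_number should look something like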
    def conv_number(s):
        try:
            return float(s)
        except ValueError:
            try:
                return int(s)
            except ValueError:
                raise WrongRequestDataError(s)
    

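I especially like the fact that you don't use isinstance to check for float / int: in Python, you should always be able to pass an object that behaves like an int or float when one is asked for.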
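
The points above can be combined into a single function. The following is a minimal sketch rather than the answerer's code: parse_points is a hypothetical name, WrongRequestDataError is redefined locally as a stand-in for the exception from the question, and stripping whitespace around each pair is an added assumption about the input format.

```python
class WrongRequestDataError(Exception):
    """Stand-in for the exception used in the question."""


def conv_number(s):
    # float() also accepts integer literals such as "34", so one call is enough.
    try:
        return float(s)
    except ValueError:
        raise WrongRequestDataError("Not a numeric value: %r" % s)


def parse_points(points):
    """Hypothetical helper: return a list of (latitude, longitude) floats."""
    parsed = []
    for point in points.split(','):
        # maxsplit=1 leaves any extra token attached to the second value,
        # so "3.41 47.11 foobar" fails in conv_number.
        parts = point.strip().split(' ', 1)
        if len(parts) != 2:
            raise WrongRequestDataError("Expected two values in %r" % point)
        parsed.append((conv_number(parts[0]), conv_number(parts[1])))
    if parsed[0] != parsed[-1]:
        raise WrongRequestDataError("The first point must be equal to the last")
    return parsed
```

Returning the parsed floats means the caller does not have to split and convert the string a second time after validation.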

Answer 3 (score: 0)

This is how I would do it:

points = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"


def validate_points(points):
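    # Strip each pair so the space after a comma does not break the comparison below.
    separate = [pair.strip() for pair in points.split(',')]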

    try:
        [float(y) for x in separate for y in x.split()]
    except ValueError:
        return False

    return separate[0] == separate[-1]

print(validate_points(points))  # False

If you do want to raise an error, you can modify or simplify the code as follows:

def validate_points(points):
    separate = [pair.strip() for pair in points.split(',')]
    for pair in separate:
        for value in pair.split():
            float(value)  # raises ValueError on a non-numeric value
    if separate[0] != separate[-1]:
        raise ValueError("the first point must be equal to the last")
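
One caveat with comparing the raw substrings is that 34.0 -118.0 and 34.00 -118.00 denote the same point but differ as text. A possible variant, sketched here under the assumption that each pair must hold exactly two whitespace-separated numbers (is_closed_ring is a hypothetical name), compares the parsed values instead:

```python
def is_closed_ring(points):
    """Hypothetical variant: True if every pair is numeric and first == last."""
    try:
        pairs = [tuple(float(v) for v in pair.split())
                 for pair in points.split(',')]
    except ValueError:
        return False
    # Reject pairs with a missing or extra value, e.g. "1 2 3".
    if any(len(pair) != 2 for pair in pairs):
        return False
    return pairs[0] == pairs[-1]
```

Because str.split() with no argument ignores leading and trailing whitespace, no explicit strip is needed in this version.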

Answer 4 (score: 0)

A solution that filters the data using regex named groups:

# -*- coding: utf-8 -*-
import re


class WrongRequestDataError(Exception):
    pass

def position_equal(pos1, pos2):
    # return pos1 == pos2  # simple comparison
    accuracy = 0.005
    return (
        abs(float(pos1['latitude']) - float(pos2['latitude'])) <= accuracy and
        abs(float(pos1['longitude']) - float(pos2['longitude'])) <= accuracy
    )

test_str = "34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, 34.0937 -118.2769, 34.0933 -118.2748"

regex = r"(?P<position>(?P<latitude>\-?\d+(\.\d+)?) (?P<longitude>\-?\d+(\.\d+)?))"
matches = re.finditer(regex, test_str, re.IGNORECASE)

matched = []
for matchNum, match in enumerate(matches):
    matched.append({
        'latitude': match.group('latitude'),
        'longitude': match.group('longitude'),
    })

matched_count = len(matched)
if matched_count != test_str.count(',') + 1:
    raise WrongRequestDataError("Please, specify the correct type of points value. It must be a numeric value")
else:
    if matched_count > 1:
        if not position_equal(matched[0], matched[-1]):
            raise WrongRequestDataError("Incorrect points format, the first point must be equal to last")

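You can change the accuracy value in the position_equal function to adjust the tolerance used when comparing the first and last positions.

You can test or debug the regex on regex101: https://regex101.com/r/tYYJXN/1/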
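
To illustrate how the tolerance changes the outcome, here is a small sketch under a few assumptions: ring_is_closed is a hypothetical name, the whole string is validated with re.fullmatch instead of counting commas, and math.isclose with an absolute tolerance stands in for the manual abs() comparison. With a tolerance of 0.005 degrees the string from the question counts as closed, while an exact comparison rejects it.

```python
import math
import re

NUMBER = r"-?\d+(?:\.\d+)?"
PAIR = NUMBER + " " + NUMBER
# Entire string: one or more pairs separated by commas with optional spaces.
WHOLE = re.compile(r"\s*{0}(?:\s*,\s*{0})*\s*".format(PAIR))


def ring_is_closed(text, tolerance=0.005):
    """Hypothetical helper: validate the format, then compare the endpoints."""
    if WHOLE.fullmatch(text) is None:
        raise ValueError("points must be comma-separated numeric pairs")
    pairs = [tuple(map(float, p.split())) for p in text.split(',')]
    (lat0, lon0), (lat1, lon1) = pairs[0], pairs[-1]
    # rel_tol=0.0 so that only the absolute tolerance (in degrees) applies.
    return (math.isclose(lat0, lat1, abs_tol=tolerance, rel_tol=0.0)
            and math.isclose(lon0, lon1, abs_tol=tolerance, rel_tol=0.0))


sample = ("34.09352 -118.27483, 34.0914 -118.2758, 34.082 -118.2782, "
          "34.0937 -118.2769, 34.0933 -118.2748")
print(ring_is_closed(sample))                 # True
print(ring_is_closed(sample, tolerance=0.0))  # False
```

Whether a tolerant comparison is appropriate depends on the application: a consumer that requires the first and last points to be identical would still reject a ring that only passes with a non-zero tolerance.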