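I have a Postgres database with several million rows. It has a column named geom that contains property boundaries.
Using a Python script, I extract information from this table and re-insert it into a new table.
When I insert into the new table, the script fails with the following error: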
Traceback (most recent call last):
File "build_parcels.py", line 258, in <module>
main()
File "build_parcels.py", line 166, in main
update_cursor.executemany("insert into parcels (par_id, street_add, title_no, proprietors, au_name, ua_name, geom) VALUES (%s, %s, %s, %s, %s, %s, %s)", inserts)
psycopg2.IntegrityError: new row for relation "parcels" violates check constraint "enforce_geotype_geom"
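The new table has a check constraint enforce_geotype_geom = ((geometrytype(geom) = 'POLYGON'::text) OR (geom IS NULL)), whereas the old table does not, so I'm guessing there is dud data or non-polygons (perhaps multipolygon data?) in the old table. I want the new data to stay polygon-only, so nothing else should be inserted.
Initially I tried wrapping the query in standard Python error handling, hoping that the rows with dud geometries would fail while the script kept running. However, the script was written to commit once at the end rather than after each row, so that does not work.
I think what I need to do is iterate through the geom rows of the old table and check what type of geometry each one is, so that I can decide whether to keep it or discard it before inserting into the new table.
What is the best way to do this?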
Answer 0 (score: 8)
This amazing bit of PostGIS SQL should help you work this out; it contains plenty of geometry type tests:
-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
--
-- $Id: cleanGeometry.sql 2008-04-24 10:30Z Dr. Horst Duester $
--
-- cleanGeometry - remove self- and ring-selfintersections from
-- input Polygon geometries
-- http://www.sogis.ch
-- Copyright 2008 SO!GIS Koordination, Kanton Solothurn, Switzerland
-- Version 1.0
-- contact: horst dot duester at bd dot so dot ch
--
-- This is free software; you can redistribute and/or modify it under
-- the terms of the GNU General Public Licence. See the COPYING file.
-- This software is without any warrenty and you use it at your own risk
--
-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CREATE OR REPLACE FUNCTION cleanGeometry(geometry)
RETURNS geometry AS
$BODY$DECLARE
inGeom ALIAS for $1;
outGeom geometry;
tmpLinestring geometry;
Begin
outGeom := NULL;
-- Clean Process for Polygon
IF (GeometryType(inGeom) = 'POLYGON' OR GeometryType(inGeom) = 'MULTIPOLYGON') THEN
-- Only process if geometry is not valid,
-- otherwise put out without change
if not st_isvalid(inGeom) THEN
-- create nodes at all self-intersecting lines by union the polygon boundaries
-- with the startingpoint of the boundary.
tmpLinestring := st_union(st_multi(st_boundary(inGeom)),st_pointn(st_boundary(inGeom),1));
outGeom := st_buildarea(tmpLinestring);
IF (GeometryType(inGeom) = 'MULTIPOLYGON') THEN
RETURN st_multi(outGeom);
ELSE
RETURN outGeom;
END IF;
else
RETURN inGeom;
END IF;
------------------------------------------------------------------------------
-- Clean Process for LINESTRINGS, self-intersecting parts of linestrings
-- will be divided into multiparts of the mentioned linestring
------------------------------------------------------------------------------
ELSIF (GeometryType(inGeom) = 'LINESTRING') THEN
-- create nodes at all self-intersecting lines by union the linestrings
-- with the startingpoint of the linestring.
outGeom := st_union(st_multi(inGeom),st_pointn(inGeom,1));
RETURN outGeom;
ELSIF (GeometryType(inGeom) = 'MULTILINESTRING') THEN
outGeom := st_multi(st_union(st_multi(inGeom),st_pointn(inGeom,1)));
RETURN outGeom;
ELSIF (GeometryType(inGeom) = '<NULL>' OR GeometryType(inGeom) = 'GEOMETRYCOLLECTION') THEN
RETURN NULL;
ELSE
RAISE NOTICE 'The input type % is not supported %',GeometryType(inGeom),st_summary(inGeom);
RETURN inGeom;
END IF;
End;$BODY$
LANGUAGE 'plpgsql' VOLATILE;
Answer 1 (score: 2)
Option 1 is to create a savepoint before each insert and roll back to that savepoint if the INSERT fails.
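A minimal sketch of option 1, written in Python to match the question's script. It assumes a DB-API connection such as one from psycopg2 and replaces the single executemany call with a per-row loop. The function name, the savepoint name, and the `errors` parameter are illustrative and do not come from the original script.

```python
def insert_skipping_bad_rows(conn, insert_sql, rows, errors):
    """Insert rows one at a time, skipping any row the database rejects.

    conn       -- an open DB-API connection (for example from psycopg2)
    insert_sql -- parameterised INSERT statement
    rows       -- iterable of parameter tuples
    errors     -- exception class(es) that mark a bad row,
                  e.g. psycopg2.IntegrityError
    Returns the list of rows that were skipped.
    """
    skipped = []
    cur = conn.cursor()
    for row in rows:
        cur.execute("SAVEPOINT before_row")
        try:
            cur.execute(insert_sql, row)
        except errors:
            # Undo only the failed INSERT; earlier rows stay in the transaction.
            cur.execute("ROLLBACK TO SAVEPOINT before_row")
            skipped.append(row)
        # A savepoint survives a rollback to it, so release it either way.
        cur.execute("RELEASE SAVEPOINT before_row")
    conn.commit()  # a single commit at the end, as the question's script does
    return skipped
```

With psycopg2 this would be called with the INSERT statement from the traceback and `errors=psycopg2.IntegrityError`. Each row costs two extra statements, which is part of the trade-off this answer weighs.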
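Option 2 is to attach the check constraint expression as a WHERE condition on the original query that produces the data, so that the offending rows are never selected.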
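Option 2 might look like the sketch below, again in Python. The source table name `old_parcels` is hypothetical, since the question never names the old table. The column list is taken from the INSERT in the traceback, on the assumption that the old table uses the same column names. The WHERE clause repeats the check constraint, and a second helper builds a query that counts rows per geometry type, which is one way to see what the old table actually contains.

```python
# Hypothetical source table name; the question does not give it.
SOURCE_TABLE = "old_parcels"

# Columns taken from the INSERT statement in the question's traceback;
# assumes the old table uses the same names.
COLUMNS = ("par_id", "street_add", "title_no", "proprietors",
           "au_name", "ua_name", "geom")

# Same expression as the enforce_geotype_geom check constraint.
POLYGON_ONLY = "(GeometryType(geom) = 'POLYGON' OR geom IS NULL)"


def build_select_sql(source_table=SOURCE_TABLE):
    """Return a SELECT that only yields rows the new table would accept.

    The table name is interpolated directly, so it must be a trusted constant.
    """
    cols = ", ".join(COLUMNS)
    return f"SELECT {cols} FROM {source_table} WHERE {POLYGON_ONLY}"


def build_type_survey_sql(source_table=SOURCE_TABLE):
    """Return a query that counts rows per geometry type in the source table."""
    return (
        f"SELECT GeometryType(geom) AS geom_type, count(*) AS n "
        f"FROM {source_table} GROUP BY 1 ORDER BY n DESC"
    )


def iter_polygon_rows(cur, source_table=SOURCE_TABLE):
    """Run the filtered SELECT on a DB-API cursor and yield its rows."""
    cur.execute(build_select_sql(source_table))
    for row in cur:
        yield row
```

The rows from `iter_polygon_rows` can then feed the existing insert unchanged. With millions of rows, a psycopg2 named (server-side) cursor avoids loading the whole result set into client memory at once.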
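Which option is best depends on the size of the table, the relative number of bad rows, and how fast and how often this needs to run.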
Answer 2 (score: 0)
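I think you can use ST_CollectionExtract: given a (multi)geometry, it returns a (multi)geometry consisting only of elements of the specified type.
I use it when inserting the result of an ST_Intersection. ST_Dump breaks any multi-polygons and collections into individual geometries, and then ST_CollectionExtract(theGeom, 3) discards anything other than polygons: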
ST_CollectionExtract((st_dump(ST_Intersection(data.polygon, grid.polygon))).geom, 3)::geometry(polygon, 4326)
The second argument above can be: 1 == POINT, 2 == LINESTRING, 3 == POLYGON