Question

I am trying to create a Python script that flags duplicate records in a point shapefile (possibly more than 5000 records) with Y or N. Something like this:

xyCombine | dplicate

E836814.148873 N814378.125749 |

E836815.033548 N814377.614688 |

E836818.016542 N814371.411850 |

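I want to check the xyCombine field for duplicates and update another field (dplicate) with Y or N depending on whether the record is a duplicate. The desired result: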

xyCombine | dplicate

E836814.148873 N814378.125749 | Y

E836814.148873 N814378.125749 | Y

E836815.033548 N814377.614688 | Y

E836818.016542 N814371.411850 | N

Here is my attempt:

# Process: Searches xyCombine field for any duplicates
duplicateCount = 0
inShapefile = pointsShapefile
fieldName = "xyCombine"
shpFRows = arcpy.UpdateCursor(inShapefile)
shpFRow = shpFRows.next()
fieldList = []
while shpFRow:
    if shpFRow.isNull(fieldName) == False and len(str(shpFRow.getValue(fieldName)).strip()) > 1:
            fieldList.append(shpFRow.getValue(fieldName))
    shpFRow = shpFRows.next()
duplicateList = [x for x, y in collections.Counter(fieldList).items() if y > 1]
print duplicateList
selectFile = pointsShapefile
selectFields = ('xyCombine','dupCHK')
shpFRows = arcpy.UpdateCursor(selectFile,selectFields)
shpFRow1 = shpFRows.next()
while shpFRow1:
    if shpFRow1.isNull(fieldName) == False and len(str(shpFRow1.getValue(fieldName)).strip()) > 1:
        for row in duplicateList:
            if shpFRow1.getValue(fieldName) == row:
                duplicate += 1
                row[1] = "Y"
            else:
                row[1] = "N"
            cursor.updateRow(row)
        shpFRow1 = shpFRows.next()
if duplicateCount > 0:
    print ""
    print "*** "+str(duplicate)+" duplicated points. ***"
    print ""

If I don't include

    row[1] = "Y"
else:
    row[1] = "N"
cursor.updateRow(row)

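the script runs correctly and prints the total number of duplicates, but it does not update the duplicate field with Y or N values. That matters because the field feeds a CSV error report later in the script.

But when I do include it, I get the following error message: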

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32

[u'E836814.148873 N814378.125749', u'E836815.033548 N814377.614688', u'E836818.016542 N814371.41185']

Traceback (most recent call last):
  File "C:\Duplicate Points Check\Python Scripts\DuplicatePointsCheck_TEST1.py", line 458, in <module>
    DuplicatePointsCheck()
  File "C:\Duplicate Points Check\Python Scripts\DuplicatePointsCheck_TEST1.py", line 94, in DuplicatePointsCheck
    row[1] = "N"
TypeError: 'unicode' object does not support item assignment
>>>

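I understand that tools in ArcGIS, such as the Field Calculator, could provide a possible solution. However, I would like to improve my understanding of Python, as I am very new to it. I apologize if this question has been asked before, but I have searched the internet and the only results I found deal with finding and deleting duplicate records. If any of you could point me in the right direction, it would be a great help. Thanks in advance.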

Answer 1

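There isn't enough information to be certain, but you appear to be using ArcGIS 10.1 or newer. If so, it looks like you are trying to use the new data access (arcpy.da) version of UpdateCursor, but are actually calling the older, 10.0-era UpdateCursor.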
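The immediate cause of the TypeError can be reproduced without ArcGIS. In the posted loop, `for row in duplicateList` binds `row` to one of the unicode strings in duplicateList rather than to a cursor row, so `row[1] = "N"` is item assignment on a string. A small standalone sketch (the variable names here are illustrative, not taken from the script):

```python
# One value as it would appear in duplicateList from the question.
duplicate_list = [u"E836814.148873 N814378.125749"]

message = None
for row in duplicate_list:
    try:
        # row is a string here, not a cursor row, and strings are immutable.
        row[1] = "N"
    except TypeError as exc:
        message = str(exc)

print(message)
```

Under Python 2.7 the message names the 'unicode' type, as in the traceback above; under Python 3 it names 'str'.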

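I haven't used ArcGIS 10.0 recently, but judging from the documentation the syntax has changed. The ArcGIS documentation lists this method for assigning values to fields with UpdateCursor: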

for row in cursor:
    # field2 will be equal to field1 multiplied by 3.0
    row.setValue(field2, row.getValue(field1) * 3.0)
    cursor.updateRow(row)
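Applied to the question's fields, the older cursor style could be sketched as below. This is only an illustration: flag_with_legacy_cursor is a made-up helper name, the field names xyCombine and dupCHK are taken from the question's code, and duplicates is assumed to be the already computed collection of repeated values (for example set(duplicateList)).

```python
def flag_with_legacy_cursor(cursor, duplicates,
                            key_field="xyCombine", flag_field="dupCHK"):
    """Set flag_field to "Y" or "N" on every row; return the number of "Y" rows."""
    count = 0
    for row in cursor:
        is_duplicate = row.getValue(key_field) in duplicates
        # Old-style rows are updated through setValue, not by index.
        row.setValue(flag_field, "Y" if is_duplicate else "N")
        cursor.updateRow(row)
        if is_duplicate:
            count += 1
    return count

# Possible usage (requires ArcGIS; names as in the question):
# cursor = arcpy.UpdateCursor(pointsShapefile)
# total = flag_with_legacy_cursor(cursor, set(duplicateList))
# del cursor  # let go of the cursor once finished
```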

You, on the other hand, appear to be using the data access syntax, as shown in this example from the ArcGIS 10.2 documentation:

with arcpy.da.UpdateCursor(fc, fields) as cursor:
    # Update the field used in Buffer so the distance is based on road
    # type. Road type is either 1, 2, 3 or 4. Distance is in meters.
    for row in cursor:
        # Update the BUFFER_DISTANCE field to be 100 times the
        # ROAD_TYPE field.
        row[1] = row[0] * 100
        cursor.updateRow(row)

Make sure you create the cursor with arcpy.da.UpdateCursor; I hope this solves your problem.
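Putting the pieces together for the question's case, a minimal sketch might look like the following. It assumes ArcGIS 10.1 or later (so that arcpy.da exists) and the field names xyCombine and dupCHK from the question's code; the function names find_duplicates and flag_duplicates are made up for illustration. arcpy is imported inside the second function so the counting helper can be tried without ArcGIS.

```python
import collections


def find_duplicates(values):
    """Return the set of non-blank values that occur more than once."""
    cleaned = [v for v in values if v and v.strip()]
    counts = collections.Counter(cleaned)
    return set(value for value, n in counts.items() if n > 1)


def flag_duplicates(shapefile, key_field="xyCombine", flag_field="dupCHK"):
    """Write "Y"/"N" into flag_field; return the number of rows flagged "Y"."""
    import arcpy  # assumption: ArcGIS 10.1+ so that arcpy.da is available

    # First pass: read every key value and work out which ones repeat.
    with arcpy.da.SearchCursor(shapefile, [key_field]) as cursor:
        duplicates = find_duplicates(row[0] for row in cursor)

    # Second pass: each row is a list here, so row[1] = ... is valid.
    flagged = 0
    with arcpy.da.UpdateCursor(shapefile, [key_field, flag_field]) as cursor:
        for row in cursor:
            is_duplicate = row[0] in duplicates
            row[1] = "Y" if is_duplicate else "N"
            cursor.updateRow(row)
            if is_duplicate:
                flagged += 1
    return flagged
```

Looking values up in a set keeps the second pass to one membership test per row, instead of looping over the whole duplicate list for every record. Note that in this sketch rows with a blank key are written as "N" rather than skipped.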

Find duplicates in a field, then update another field with Y or N using Python (ArcGIS)
