Question

I am trying to extract data from an OSM.PBF file using the Python GDAL/OGR module.

Currently my code looks like this:

from osgeo import gdal, ogr

osm = ogr.Open('file.osm.pbf')

# Select the multipolygons layer
layer = osm.GetLayer(3)
# Create list to store pubs
pubs = []
for feat in layer:
    if feat.GetField('amenity') == 'pub':
        pubs.append(feat)

This small piece of code works fine with small .pbf files (15 MB). However, when parsing files larger than 50 MB, I get the following error:

 ERROR 1: Too many features have accumulated in points layer. Use OGR_INTERLEAVED_READING=YES MODE

When I turn this mode on:

gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')

ogr no longer returns any features, even when parsing small files.

Does anyone know what is going on here?

Answer 1

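Thanks to scai's answer, I was able to figure it out.

The special reading pattern required for interleaved reading, as described at gdal.org/1.11/ogr/drv_osm.html, is translated below into a working Python example.

This is an example of how to extract all features in an .osm.pbf file that have the 'amenity=pub' tag: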

from osgeo import gdal, ogr

gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')
osm = ogr.Open('file.osm.pbf')

# Grab available layers in file
nLayerCount = osm.GetLayerCount()

thereIsDataInLayer = True

pubs = []

while thereIsDataInLayer:

    thereIsDataInLayer = False

    # Cycle through available layers
    for iLayer in range(nLayerCount):

        lyr = osm.GetLayer(iLayer)

        # Not every layer necessarily has an 'amenity' field, so check first
        hasAmenity = lyr.GetLayerDefn().GetFieldIndex('amenity') >= 0

        # Get first feature from layer
        feat = lyr.GetNextFeature()

        while feat is not None:

            thereIsDataInLayer = True

            # Do something with the feature; in this case store a copy in a
            # list, because the original is destroyed below
            if hasAmenity and feat.GetField('amenity') == 'pub':
                pubs.append(feat.Clone())

            # The destroy method is necessary for interleaved reading
            feat.Destroy()

            feat = lyr.GetNextFeature()

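As far as I understand it, a while loop is needed instead of a for loop because, when using interleaved reading, it is not possible to get the feature count of a layer.

Any further clarification on why this code works the way it does would be much appreciated.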
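
The same reading pattern could also be wrapped in a generator, which keeps the "keep cycling through the layers until none of them returns a feature" logic separate from the filtering. This is only a sketch under a few assumptions: iter_interleaved, collect_tagged and find_pubs are hypothetical helper names (not part of GDAL), the datasource is assumed to be read with OGR_INTERLEAVED_READING set to YES, and only the layer name and feature ID of each match are kept rather than the feature itself.

```python
def iter_interleaved(datasource):
    """Yield (layer_name, feature) pairs using the interleaved reading pattern.

    Hypothetical helper: `datasource` is assumed to be an OGR datasource
    that is read with OGR_INTERLEAVED_READING=YES.
    """
    layers = [datasource.GetLayer(i) for i in range(datasource.GetLayerCount())]
    found_data = True
    # Keep cycling through the layers until a full pass yields no feature
    while found_data:
        found_data = False
        for layer in layers:
            feat = layer.GetNextFeature()
            # Drain whatever this layer has available right now
            while feat is not None:
                found_data = True
                yield layer.GetName(), feat
                feat = layer.GetNextFeature()


def collect_tagged(datasource, key, value):
    """Return (layer_name, FID) for every feature whose `key` field equals `value`."""
    matches = []
    for layer_name, feat in iter_interleaved(datasource):
        # Skip features from layers that do not expose `key` as a field
        if feat.GetFieldIndex(key) >= 0 and feat.GetField(key) == value:
            matches.append((layer_name, feat.GetFID()))
    return matches


def find_pubs(path):
    """Usage sketch: needs the GDAL Python bindings and a real .osm.pbf file."""
    from osgeo import gdal, ogr
    gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')
    osm = ogr.Open(path)
    return collect_tagged(osm, 'amenity', 'pub')
```

Calling find_pubs('file.osm.pbf') would then return a list of (layer name, feature ID) pairs. Storing plain values instead of feature objects avoids holding on to features while the file is still being read.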

Parsing osm.pbf data using the GDAL/OGR Python module
