Problems Reading Zip of Shapefiles without loading memory

时间:2019-03-17 22:32:19

标签: python-3.x macos zipfile shapefile stringio

I've been trying to adapt Andrew Gaidus shapefile reading routine for my needs. The Jupyter Notebook I'm using acts like it partitioned the disk of my MacBook Pro so I can't read or write to disk. Gaidus has a good procedure for avoiding using disk, but is written for prior version of Python.

Here is the code:

    dls = "https://github.com/ItsMeLarry/Coursera_Capstone/raw/master/tl_2010_25009_tract00%202.zip"
lynntracts = ZipFile(io.BytesIO(urllib.request.urlopen(dls).read()))
print("Done")

filenames = [y for y in sorted(lynntracts.namelist()) for ending in ['dbf', 'prj', 'shp', 'shx'] if y.endswith(ending)] 
#For some reason, I get 8, instead of 4, filenames.  The first 4 start with __MACOSX. I get rid of those. The problem I
#have with the 'TypeError' occurs no matter which set of 4 files I use.
print(filenames[0], 'Example of the 4 files that I remove in the for loop')
for i in range(0,4):
     del filenames[0]
print(filenames)
dbf, prj, shp, shx = [io.StringIO(ZipFile.read(filename)) for filename in filenames]
r = shapefile.Reader(shp=shp, shx=shx, dbf=dbf)
print(r.numRecords)

Opening with io.BytesIO cured the prior problem of byte/str collision. Now see the TypeError for the ZipFile.read. I get the same error if I use io.BytesIO when calling it. Here is error output followed by error info:

Done __MACOSX/tl_2010_25009_tract00/._tl_2010_25009_tract00.dbf Example of the 4 files that I remove in the for loop

['tl_2010_25009_tract00/tl_2010_25009_tract00.dbf', 'tl_2010_25009_tract00/tl_2010_25009_tract00.prj', 'tl_2010_25009_tract00/tl_2010_25009_tract00.shp', 'tl_2010_25009_tract00/tl_2010_25009_tract00.shx']

TypeError Traceback (most recent call last) in () 12 del filenames[0] 13 print(filenames) ---> 14 dbf, prj, shp, shx = [io.StringIO(ZipFile.read(filename)) for filename in filenames] 15 r = shapefile.Reader(shp=shp, shx=shx, dbf=dbf) 16 print(r.numRecords)

in (.0) 12 del filenames[0] 13 print(filenames) ---> 14 dbf, prj, shp, shx = [io.StringIO(ZipFile.read(filename)) for filename in filenames] 15 r = shapefile.Reader(shp=shp, shx=shx, dbf=dbf) 16 print(r.numRecords)

TypeError: read() missing 1 required positional argument: 'name'

Clearly, I am a beginner. I've come up empty handed trying to research this. Where do I go? What do I need to understand here? Thanks

0 个答案:

没有答案