How do I read a file as nested lists of coordinates for multiple polygons?

Time: 2014-09-20 23:13:19

Tags: python regex python-3.x file-io matplotlib

I have a file containing many sections like the following:

[40.742742,-73.993847]
[40.739389,-73.985667]
[40.74715499999999,-73.97992]
[40.750573,-73.988415]
[40.742742,-73.993847]

[40.734706,-73.991915]
[40.736917,-73.990263]
[40.736104,-73.98846]
[40.740315,-73.985263]
[40.74364800000001,-73.993353]
[40.73729099999999,-73.997988]
[40.734706,-73.991915]

[40.729226,-74.003463]
[40.7214529,-74.006038]
[40.717745,-74.000389]
[40.722299,-73.996634]
[40.725291,-73.994413]
[40.729226,-74.003463]
[40.754604,-74.007836]
[40.751289,-74.000649]
[40.7547179,-73.9983309]
[40.75779,-74.0054339]
[40.754604,-74.007836]

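I need to read each section as a list of coordinate pairs (the sections are separated by an extra \n).

For a similar file (identical except that it has no extra newlines), I draw a single polygon from the entire file. I can read the coordinates and plot them in matplotlib with the following code: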

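import matplotlib.pyplot as plt

mVerts = []
with open('Manhattan_Coords.txt') as f:
    for line in f:
        # Split on "," alone: float() ignores surrounding whitespace,
        # so this handles both "a,b" and "a, b".
        pair = [float(s) for s in line.strip()[1:-1].split(",")]
        mVerts.append(pair)

plt.plot(*zip(*mVerts))
plt.show()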

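How can I accomplish the same task for multiple polygons, where each polygon in my file is separated by an extra newline?

3 Answers:

Answer 0 (score: 4)

This is my personal favorite way to "chunk" a file into groups of lines separated by blank lines: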

from itertools import groupby

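def chunk_groups(it):
    """Yield each run of consecutive non-blank lines as a list."""
    stripped_lines = (x.strip() for x in it)
    # bool('') is False, so groupby alternates between blank and non-blank runs
    for k, group in groupby(stripped_lines, bool):
        if k:
            yield list(group)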
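
To see why passing bool as the key works, here is a minimal sketch on a made-up in-memory list of lines (the `lines` data and the variable names are illustrative assumptions, not part of the original file):

```python
from itertools import groupby

# Hypothetical stand-in for the file: two sections separated by a blank line.
lines = ["[1.0,2.0]\n", "[3.0,4.0]\n", "\n", "[5.0,6.0]\n"]

stripped = [line.strip() for line in lines]

# bool("") is False and bool of any non-empty string is True, so consecutive
# non-blank lines share the key True and each blank run shares the key False.
runs = [(key, list(group)) for key, group in groupby(stripped, bool)]

# Keeping only the True runs leaves one list of lines per section.
sections = [group for key, group in runs if key]
```

Several blank lines in a row collapse into a single False run, so this grouping should not produce empty sections.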

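I'd suggest ast.literal_eval to convert the string representation of each list into an actual Python list:

from ast import literal_eval

# filename is the path to the coordinates file
with open(filename) as f:
    result = [[literal_eval(li) for li in chunk] for chunk in chunk_groups(f)]

This gives: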

result
Out[66]: 
[[[40.742742, -73.993847],
  [40.739389, -73.985667],
  [40.74715499999999, -73.97992],
  [40.750573, -73.988415],
  [40.742742, -73.993847]],
 [[40.734706, -73.991915],
  [40.736917, -73.990263],
  [40.736104, -73.98846],
  [40.740315, -73.985263],
  [40.74364800000001, -73.993353],
  [40.73729099999999, -73.997988],
  [40.734706, -73.991915]],
 [[40.729226, -74.003463],
  [40.7214529, -74.006038],
  [40.717745, -74.000389],
  [40.722299, -73.996634],
  [40.725291, -73.994413],
  [40.729226, -74.003463],
  [40.754604, -74.007836],
  [40.751289, -74.000649],
  [40.7547179, -73.9983309],
  [40.75779, -74.0054339],
  [40.754604, -74.007836]]]
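
The question also asks about plotting, so here is one possible way to feed a nested list like the one above to matplotlib. The helper names `to_xy_series` and `plot_polygons` are hypothetical, and the sketch assumes every point has exactly two values. As in the question's own code, the first value of each pair ends up on the horizontal axis.

```python
def to_xy_series(polygons):
    """Turn [[[a, b], ...], ...] into [([a, ...], [b, ...]), ...], one pair per polygon."""
    series = []
    for polygon in polygons:
        if not polygon:
            continue  # nothing to plot for an empty section
        first, second = zip(*polygon)  # transpose the list of pairs
        series.append((list(first), list(second)))
    return series


def plot_polygons(polygons):
    """Draw each polygon as its own line; needs matplotlib installed."""
    import matplotlib.pyplot as plt  # imported lazily so to_xy_series works without it

    fig, ax = plt.subplots()
    ax.ticklabel_format(useOffset=False)  # show full coordinate values on the axes
    for first, second in to_xy_series(polygons):
        ax.plot(first, second)
    plt.show()
```

Calling plot_polygons(result) should then draw one closed outline per section of the file.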

Answer 1 (score: 2)

A slight variation on roippi's idea, using json instead of ast:

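import json
from itertools import groupby
from pprint import pprint

# FILE is the path to the coordinates file.
with open(FILE, "r") as coordinates_file:
    # Consecutive lines share a group while line.isspace() stays the same,
    # so blank lines and coordinate lines end up in alternating groups.
    grouped = groupby(coordinates_file, lambda line: line.isspace())
    groups = (group for empty, group in grouped if not empty)

    polygons = [[json.loads(line) for line in group] for group in groups]

pprint(polygons)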
#>>> [[[40.742742, -73.993847],
#>>>   [40.739389, -73.985667],
#>>>   [40.74715499999999, -73.97992],
#>>>   [40.750573, -73.988415],
#>>>   [40.742742, -73.993847]],
#>>>  [[40.734706, -73.991915],
#>>>   [40.736917, -73.990263],
#>>>   [40.736104, -73.98846],
#>>>   [40.740315, -73.985263],
#>>>   [40.74364800000001, -73.993353],
#>>>   [40.73729099999999, -73.997988],
#>>>   [40.734706, -73.991915]],
#>>>  [[40.729226, -74.003463],
#>>>   [40.7214529, -74.006038],
#>>>   [40.717745, -74.000389],
#>>>   [40.722299, -73.996634],
#>>>   [40.725291, -73.994413],
#>>>   [40.729226, -74.003463],
#>>>   [40.754604, -74.007836],
#>>>   [40.751289, -74.000649],
#>>>   [40.7547179, -73.9983309],
#>>>   [40.75779, -74.0054339],
#>>>   [40.754604, -74.007836]]]
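
For lines shaped like the ones in this file, json.loads and ast.literal_eval should give the same result. A small sketch of where they differ, using a made-up tuple-style line as the assumption:

```python
import json
from ast import literal_eval

line = "[40.742742,-73.993847]\n"  # one line as read from the file, newline included

# Both parsers tolerate the trailing newline and return the same list of floats.
from_json = json.loads(line)
from_ast = literal_eval(line)

# A tuple is a valid Python literal but not valid JSON.
tuple_text = "(40.742742,-73.993847)"  # hypothetical variant of the format
as_tuple = literal_eval(tuple_text)
try:
    json.loads(tuple_text)
    json_accepts_tuple = True
except json.JSONDecodeError:
    json_accepts_tuple = False
```

So json is the stricter choice if the file is known to contain only bracketed number lists, while literal_eval accepts any Python literal.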

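Answer 2 (score: 2)

The answers already posted take a number of nifty approaches. There is nothing wrong with any of them.

However, there is also nothing wrong with taking the obvious-but-readable approach.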

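On a side note, you appear to be working with geographic data. This kind of format is something you will run into all the time, and the segment delimiter often isn't as obvious as an extra newline. (There are a lot of rather bad ad-hoc "ascii export" formats out there, particularly in obscure proprietary software. For example, one common format uses an F at the end of the last line in a segment as the delimiter (i.e. 1.0 2.0F). Many others use no delimiter at all and require you to start a new segment/polygon whenever a point is more than "x" distance away from the previous one.) Furthermore, these things often wind up being multi-GB ascii files, so reading the entire thing into memory can be impractical.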
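
As a rough illustration of the two ad-hoc conventions just mentioned, the sketch below splits a stream of points either on a trailing F or on a distance threshold. The function names, the whitespace-separated "x y" layout, and the `max_gap` parameter are assumptions made for this example, not a description of any specific export format. Both functions are generators, so they never hold more than one segment in memory.

```python
import math


def split_on_f_suffix(lines):
    """Yield lists of (x, y) points; a trailing 'F' on a line ends the segment.

    Assumes whitespace-separated values such as '1.0 2.0' or '1.0 2.0F'.
    """
    segment = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        ends_segment = text.endswith("F")
        if ends_segment:
            text = text[:-1]  # drop the marker before parsing the numbers
        x, y = (float(value) for value in text.split())
        segment.append((x, y))
        if ends_segment:
            yield segment
            segment = []
    if segment:  # data that ended without a final 'F'
        yield segment


def split_on_gap(points, max_gap):
    """Yield lists of points, starting a new list when a jump exceeds max_gap."""
    segment = []
    for point in points:
        if segment:
            last = segment[-1]
            jump = math.hypot(point[0] - last[0], point[1] - last[1])
            if jump > max_gap:
                yield segment
                segment = []
        segment.append(point)
    if segment:
        yield segment
```

Note that math.hypot measures straight-line distance in the raw coordinate units; for latitude/longitude data a proper geodesic distance may be more appropriate.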


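My point is: whichever approach you choose, make sure you understand it. You are going to do this again, and next time the format will be just different enough to be hard to generalize. You should definitely learn libraries like itertools, but make sure you fully understand the functions you are calling.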


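Here is one version of the "obvious-but-readable" approach. It is more verbose, but no one will be left scratching their head over what it does. (You could write this same logic in several slightly different ways. Use whatever makes the most sense to you.)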

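import matplotlib.pyplot as plt

def polygons(infile):
    """Yield one list of [a, b] pairs per blank-line-separated section."""
    group = []
    for line in infile:
        line = line.strip()
        if line:
            coords = line[1:-1].split(',')
            # list(...) because map() is a one-shot iterator in Python 3
            group.append(list(map(float, coords)))
        elif group:
            # A blank line ends the current section; repeated blank lines are skipped.
            yield group
            group = []
    if group:
        # Final section when the file does not end with a blank line
        yield group

fig, ax = plt.subplots()
ax.ticklabel_format(useOffset=False)

with open('data.txt', 'r') as infile:
    for poly in polygons(infile):
        ax.plot(*zip(*poly))

plt.show()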
